The term “dogma” describes a doctrine or code of beliefs accepted as authoritative. The central dogma of biology refers to the way that genetic information is stored and retrieved in living cells. The classic relationship is DNA > RNA > Protein. Thus DNA functions as the information storage molecule, and this information is “read out” into RNA molecules.
Some of these RNAs are intermediates that carry the information used to produce proteins. It is the proteins (and some RNAs) that are the “active” workers in the cell: they catalyze reactions, move things around, create structures, and so on. Thus the information stored in DNA is the genotype (the sum of inheritable potential), and when this information is transcribed into RNA and translated into protein, a phenotype (the sum of observable characteristics) is produced. The discovery of the structure of DNA by Watson and Crick in 1953 was a milestone for biology, leading to a molecular understanding of how the sequence of nucleotides making up the DNA molecule encodes information.
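The DNA > RNA > protein flow described above can be sketched in a few lines of Python. This is only an illustration, not a model of what the cell actually does: it assumes the input is the coding (sense) strand of the DNA written 5' to 3', it uses the standard genetic code, and it starts reading at the first AUG. The function names `transcribe` and `translate` and the example sequence are made up for this sketch.

```python
from itertools import product

# Standard genetic code, listed with the bases in the order U, C, A, G.
# One-letter amino acid codes; "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(coding_strand):
    """Return the mRNA matching a DNA coding strand (T is replaced by U)."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna):
    """Read codons from the first AUG until a stop codon or the end of the RNA."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon, so nothing is translated in this sketch
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical 12-base sequence: ATG GCC TTT TAA
dna = "ATGGCCTTTTAA"
mrna = transcribe(dna)
print(mrna)             # AUGGCCUUUUAA
print(translate(mrna))  # MAF (Met-Ala-Phe)
```

In this toy picture the string `dna` plays the role of the stored information (genotype), and the protein string is the first step toward an observable product (phenotype).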
Historically, much of our knowledge of reactions occurring in cells has come from isolating and studying individual types of protein molecules. This work led to the delineation of various metabolic pathways, signaling events, and structural elements, and eventually to tools for manipulating DNA itself. You will learn about these later in the course.
These methods have now given access to vast stores of genetic information. The Human Genome Project (begun in 1990, with a working draft completed 10 years later) led to the development of fast and accurate DNA sequencing techniques. Over the past 20 years a huge quantity of sequence information has been generated. In 1996, scientists completed the total nucleotide sequence of the DNA of yeast: about 12 million base pairs, in which over 6,000 genes were identified. Since then, the chromosomal DNA of many microbes has been sequenced (now over 400 organisms). A virtually complete sequence of human DNA was finished in 2003. The human genome consists of approximately 3.2 billion base pairs and contains approximately 25,000 genes. However, we do not yet know the function of many of these genes. The power to manipulate DNA sequences gives us new ways to probe these functions and to answer questions about how cells work.
Since proteins are the active molecules in the cell and some of the key reagents in the technology behind molecular biology, we will begin the course with a discussion of their general properties. Proteins are made from building blocks called amino acids, which are strung together in long polymer chains. The chains fold and coil in three dimensions to achieve a structure with a biological function. Understanding this critical process requires an in-depth discussion of the various forces that stabilize a protein into a given conformation (shape).
Proteins are macromolecules with molecular weights ranging from about 5 kilodaltons (kDa) to several thousand kDa. A simple cell such as yeast contains about 6,000 different proteins. Many of these proteins are biological catalysts (enzymes), each of which typically catalyzes a single chemical reaction in the cell, but others serve a range of functions. In fact, proteins are the most diverse class of macromolecules, with a huge range of sizes, shapes, copy numbers, solubilities, and functions. Underlying this complexity, however, is a very simple fundamental structure: all proteins are synthesized from combinations of some or all of the 20 standard amino acids. Basically, proteins are linear polymers of amino acids. To understand protein structure, we have to start with the amino acids.
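To get a feel for these numbers, the rough Python sketch below relates chain length to molecular weight and counts how many different linear sequences 20 amino acids allow. It assumes an average mass of about 110 daltons per amino acid residue, a common rule of thumb that is not stated in the text above; real proteins deviate from it depending on their composition. The function names are chosen for this example only.

```python
# Assumption: an "average" amino acid residue in a protein weighs about 110 Da.
AVG_RESIDUE_MASS_DA = 110
NUM_AMINO_ACIDS = 20


def estimate_mass_kda(n_residues):
    """Approximate molecular weight (kDa) of a chain of n_residues amino acids."""
    return n_residues * AVG_RESIDUE_MASS_DA / 1000


def residues_for_mass(mass_kda):
    """Approximate number of residues in a protein of the given mass (kDa)."""
    return round(mass_kda * 1000 / AVG_RESIDUE_MASS_DA)


def possible_sequences(n_residues):
    """Number of distinct linear sequences of length n_residues (20 choices per position)."""
    return NUM_AMINO_ACIDS ** n_residues


print(residues_for_mass(5))    # a 5 kDa protein is roughly 45 residues long
print(estimate_mass_kda(300))  # a 300-residue chain is roughly 33 kDa
print(possible_sequences(3))   # 8000 different tripeptides
```

Even a short chain can be built in an enormous number of ways, which is how a simple linear polymer can give rise to such a diverse class of molecules.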