The Genetic Code

(a) The nature of the problem

Genes are made of nucleic acid. Enzymes are made of protein. The amino acid sequence of a particular protein is synthesized under instruction from a particular piece of nucleic acid.

Each protein is made of one or more polypeptide chains, synthesized by condensing together amino acids, head to tail, with the elimination of water. A typical polypeptide chain is several hundred amino acid residues long. Nevertheless only twenty different kinds of amino acids are commonly found in proteins. This standard set of twenty is the same throughout nature.

Nucleic acid is made of polynucleotide chains. The repeating unit of the chain is a sugar (ribose for RNA, deoxyribose for DNA) connected to a phosphate. A base is joined on to each sugar. There are four common bases in nucleic acid. DNA usually has adenine, guanine, cytosine and thymine. In RNA thymine is replaced by uracil.

Thus protein is written in a twenty-letter language, nucleic acid in a four-letter language. The genetic code is the dictionary which connects the two languages. The group of bases which codes one amino acid is called a codon. It is now known that each codon consists of three adjacent bases.

Notice that as far as we know the cell can translate in one direction only, from nucleic acid to protein, not from protein to nucleic acid. This hypothesis is known as the Central Dogma.

If one could compare the base sequence of a long piece of nucleic acid with the amino acid sequence for which it codes, the genetic code could easily be deduced. Unfortunately this direct approach is not yet possible because of the technical difficulties in determining a long nucleotide sequence. Consequently more indirect approaches must be used.
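
The arithmetic behind a four-letter language spelling a twenty-letter one, and the direct approach just described, can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a method from the text: the function names `min_codon_length` and `deduce_code` are hypothetical, the short message and its residues are made-up inputs, and the message is assumed to be read in successive, non-overlapping codons from a fixed starting point.

```python
BASES = "ACGU"        # the four common RNA bases named in the text
N_AMINO_ACIDS = 20    # size of the standard set of amino acids


def min_codon_length(n_bases: int, n_amino_acids: int) -> int:
    """Smallest codon length giving at least one codon per amino acid."""
    length = 1
    while n_bases ** length < n_amino_acids:
        length += 1
    return length


def deduce_code(bases: str, residues: list[str], codon_length: int = 3) -> dict[str, str]:
    """Direct approach: pair successive codons with successive residues.

    Assumes (for illustration only) non-overlapping codons read from the
    first base, with no gaps between them.
    """
    if len(bases) != codon_length * len(residues):
        raise ValueError("base and amino acid sequences differ in length")
    table: dict[str, str] = {}
    for i, residue in enumerate(residues):
        codon = bases[i * codon_length:(i + 1) * codon_length]
        # A codon seen twice must stand for the same amino acid both times.
        if table.setdefault(codon, residue) != residue:
            raise ValueError(f"codon {codon} paired with two amino acids")
    return table


print(min_codon_length(len(BASES), N_AMINO_ACIDS))  # 3
print(len(BASES) ** 3)                              # 64

# Hypothetical toy message and residues, chosen only to show the pairing.
print(deduce_code("AUGUUUGGCUUU", ["Met", "Phe", "Gly", "Phe"]))
```

Pairs of bases give only 16 combinations, too few for twenty amino acids, whereas triplets give 64. The pairing function also shows why the direct approach is easy in principle: once both sequences are known, building the dictionary is mere bookkeeping, so the obstacle is entirely in determining the nucleotide sequence.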