Rational genomics I: Antisense open reading frames and codon bias in short‐chain oxidoreductase enzymes and the evolution of the genetic code

The short‐chain oxidoreductase (SCOR) family of enzymes includes over 6000 members, extending from bacteria and archaea to humans. Nucleic acid sequence analysis reveals that significant numbers of these genes are remarkably free of stop codons in reading frames other than the coding frame, including those on the antisense strand. The genes in this subset also use almost exclusively the GC‐rich half of the 64 codons. Analysis of a million hypothetical genes having random nucleotide composition shows that the percentage of SCOR genes having multiple open reading frames exceeds that expected at random by a factor of as much as 1 × 10^6. Nevertheless, screening the content of the SWISS‐PROT TrEMBL database reveals that 15% of all genes contain multiple open reading frames. The SCOR genes having multiple open reading frames and a GC‐rich coding bias exhibit a similar GC bias in the nucleotide triplet composition of their DNA. This bias is not correlated with the GC content of the species in which the SCOR genes are found. One possible explanation for the conservation of multiple open reading frames and extreme bias in nucleic acid composition in this family of Rossmann folds is that the primordial member of the family was encoded early using only very stable GC‐rich DNA and that evolution proceeded with extremely limited introduction of any codons having two or more adenine or thymine nucleotides. These and other data suggest that the SCOR family of enzymes may even have diverged from a common ancestor before most of the AT‐rich half of the genetic code was fully defined. Proteins 2005. © 2005 Wiley‐Liss, Inc.
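The kind of analysis summarized above can be illustrated with a minimal sketch. It is not the authors' code: the function names, the uniform base composition of the random genes, and the gene length and sample size in the usage note are all assumptions made for illustration. The sketch counts how many of the six reading frames (three per strand) of a sequence contain no stop codon under the standard genetic code, measures the fraction of coding-frame codons with two or more A/T nucleotides, and estimates how often random sequences reach a given number of open frames.

```python
import random

STOP_CODONS = {"TAA", "TAG", "TGA"}          # standard genetic code
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the antisense strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def frame_is_open(seq, offset):
    """True if the frame starting at `offset` contains no stop codon."""
    return all(seq[i:i + 3] not in STOP_CODONS
               for i in range(offset, len(seq) - 2, 3))


def count_open_frames(seq):
    """Number of stop-free frames among the six frames of both strands."""
    seq = seq.upper()
    strands = (seq, reverse_complement(seq))
    return sum(frame_is_open(s, off) for s in strands for off in range(3))


def at_rich_codon_fraction(seq):
    """Fraction of frame-0 codons having two or more A or T nucleotides."""
    seq = seq.upper()
    codons = [seq[i:i + 3] for i in range(0, len(seq) - 2, 3)]
    if not codons:
        return 0.0
    at_rich = sum(1 for c in codons if sum(b in "AT" for b in c) >= 2)
    return at_rich / len(codons)


def random_open_frame_rate(n_genes, length, min_open, seed=0):
    """Fraction of random sequences with at least `min_open` open frames.

    Assumption: each base is drawn uniformly from A, C, G, T.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_genes):
        gene = "".join(rng.choice("ACGT") for _ in range(length))
        if count_open_frames(gene) >= min_open:
            hits += 1
    return hits / n_genes
```

For example, `count_open_frames(gene)` applied to a real coding sequence can be compared with `random_open_frame_rate(100_000, 750, 3)`, where 750 nucleotides is a hypothetical gene length chosen only for illustration. Because the random model here ignores real nucleotide composition, the resulting ratio is a rough indication, not a reproduction of the reported factor.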
