Evolution of function in protein superfamilies, from a structural perspective.

The recent growth in protein databases has revealed the functional diversity of many protein superfamilies. We have assessed the functional variation of homologous enzyme superfamilies containing two or more enzymes, as defined by the CATH protein structure classification, by way of the Enzyme Commission (EC) scheme. When sequence and structure information are combined to identify relatives, the majority of superfamilies display variation in enzyme function, with 25 % of superfamilies in the PDB having members of different enzyme types. We determined the extent of functional similarity at different levels of sequence identity for 486,000 homologous pairs (enzyme/enzyme and enzyme/non-enzyme), with structural and sequence relatives included. For single and multi-domain proteins, variation in EC number is rare above 40 % sequence identity, and above 30 %, the first three digits may be predicted with an accuracy of at least 90 %. For more distantly related proteins sharing less than 30 % sequence identity, functional variation is significant, and below this threshold, structural data are essential for understanding the molecular basis of observed functional differences. To explore the mechanisms for generating functional diversity during evolution, we have studied in detail 31 diverse enzyme superfamilies for which structural data are available. A large number of variations and peculiarities are observed, from the atomic level through to gross structural rearrangements. Almost all superfamilies exhibit functional diversity generated by local sequence variation and domain shuffling. Commonly, substrate specificity is diverse across a superfamily, whilst the reaction chemistry is maintained. In many superfamilies, the position of catalytic residues may vary even though they play equivalent functional roles in related proteins. The implications of functional diversity within superfamilies for structural genomics projects are discussed.
More detailed information on these superfamilies is available at http://www.biochem.ucl.ac.uk/bsm/FAM-EC/.
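
As an illustration of what conservation of the "first three digits" of an EC number means for a homologous pair, the following minimal Python sketch counts how many leading EC fields two enzymes share. The function name shared_ec_levels and the treatment of "-" as an unassigned field are assumptions made for this example; they are not taken from the study's own software.

```python
def shared_ec_levels(ec_a: str, ec_b: str) -> int:
    """Count how many leading EC fields two EC numbers share (0 to 4).

    Hypothetical helper for illustration only.
    """
    shared = 0
    for field_a, field_b in zip(ec_a.split("."), ec_b.split(".")):
        # Stop at the first differing field. A "-" marks an unassigned
        # field, so it is not counted as a match (an assumption here).
        if field_a != field_b or field_a == "-":
            break
        shared += 1
    return shared


if __name__ == "__main__":
    # Trypsin (3.4.21.4) and chymotrypsin (3.4.21.1) differ only in
    # the fourth field, which encodes substrate specificity.
    print(shared_ec_levels("3.4.21.4", "3.4.21.1"))
```

Under this convention, a pair scoring 3 or 4 would count as conserved to the first three digits, and a pair scoring 4 as having an identical EC number.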

[1]  S. B. Needleman,et al.  A general method applicable to the search for similarities in the amino acid sequence of two proteins. , 1970, Journal of molecular biology.

[2]  R. Jensen Enzyme recruitment in evolution of new function. , 1976, Annual review of microbiology.

[3]  G J Williams,et al.  The Protein Data Bank: a computer-based archival file for macromolecular structures. , 1978, Archives of biochemistry and biophysics.

[4]  G J Williams,et al.  The Protein Data Bank: a computer-based archival file for macromolecular structures. , 1978, Archives of biochemistry and biophysics.

[5]  D. Stammers,et al.  Electron density map of apoferritin at 2.8-Å resolution , 1978, Nature.

[6]  D. Rice,et al.  Ferritin: design and formation of an iron-storage molecule. , 1984, Philosophical transactions of the Royal Society of London. Series B, Biological sciences.

[7]  J. S. Miles,et al.  Structural and functional relationships between fumarase and aspartase. Nucleotide sequences of the fumarase (fumC) and aspartase (aspA) genes of Escherichia coli K12. , 1986, The Biochemical journal.

[8]  K. Tasanen,et al.  Molecular cloning of the beta‐subunit of human prolyl 4‐hydroxylase. This subunit and protein disulphide isomerase are products of the same gene. , 1987, The EMBO journal.

[9]  L. Narhi,et al.  Identification and characterization of two functional domains in cytochrome P-450BM-3, a catalytically self-sufficient monooxygenase induced by barbiturates in Bacillus megaterium. , 1987, The Journal of biological chemistry.

[10]  J. Piatigorsky,et al.  Recruitment of enzymes as lens structural proteins. , 1987, Science.

[11]  E. Padlan,et al.  Three-dimensional structure of the tryptophan synthase alpha 2 beta 2 multienzyme complex from Salmonella typhimurium. , 1988, The Journal of biological chemistry.

[12]  R. Huber,et al.  Crystal structure determination, refinement and molecular model of creatine amidinohydrolase from Pseudomonas putida. , 1988, Journal of molecular biology.

[13]  D. Lipman,et al.  Improved tools for biological sequence comparison. , 1988, Proceedings of the National Academy of Sciences of the United States of America.

[14]  W R Taylor,et al.  Protein structure alignment. , 1989, Journal of molecular biology.

[15]  P. Christen,et al.  Evolutionary relationships among aminotransferases , 1989 .

[16]  Hans Eklund,et al.  Three-dimensional structure of the free radical protein of ribonucleotide reductase , 1990, Nature.

[17]  N.,et al.  Protein disulfide isomerase is a component of the microsomal triglyceride transfer protein complex. , 1990, The Journal of biological chemistry.

[18]  G. Petsko Déjá vu all over again , 1991, Nature.

[19]  P. Kraulis A program to produce both detailed and schematic plots of protein structures , 1991 .

[20]  J. Piatigorsky,et al.  The recruitment of crystallins: new functions precede gene duplication , 1991, Science.

[21]  B Henrissat,et al.  A classification of glycosyl hydrolases based on amino acid sequence similarities. , 1991, The Biochemical journal.

[22]  J. Kuriyan,et al.  Convergent evolution of similar function in two structurally divergent enzymes , 1991, Nature.

[23]  F. Winkler,et al.  Pancreatic lipases: evolutionary intermediates in a positional change of catalytic carboxylates? , 1992, The Journal of biological chemistry.

[24]  M Wilmanns,et al.  Three-dimensional structure of the bifunctional enzyme phosphoribosylanthranilate isomerase: indoleglycerolphosphate synthase from Escherichia coli refined at 2.0 A resolution. , 1992, Journal of molecular biology.

[25]  Joel L. Sussman,et al.  The α/β hydrolase fold , 1992 .

[26]  C. Chothia One thousand families for the molecular biologist , 1992, Nature.

[27]  E. Webb Enzyme nomenclature 1992. Recommendations of the Nomenclature Committee of the International Union of Biochemistry and Molecular Biology on the Nomenclature and Classification of Enzymes. , 1992 .

[28]  G. Schneider,et al.  A thiamin diphosphate binding fold revealed by comparison of the crystal structures of transketolase, pyruvate oxidase and pyruvate decarboxylase. , 1993, Structure.

[29]  G. Petsko,et al.  On the origin of enzymatic species. , 1993, Trends in biochemical sciences.

[30]  A. Murzin Can homologous proteins evolve different enzymatic activities? , 1993, Trends in biochemical sciences.

[31]  A. Barrett [1] Classification of peptidases , 1994 .

[32]  A. Tarentino,et al.  Crystal structure of endo-beta-N-acetylglucosaminidase F1, an alpha/beta-barrel enzyme adapted for a complex substrate. , 1994, Biochemistry.

[33]  P. Lindley,et al.  The structure of avian eye lens δ-crystallin reveals a new fold for a superfamily of oligomeric enzymes , 1994, Nature Structural Biology.

[34]  L. L. Lloyd,et al.  Enzyme nomenclature — Recommendations of the Nomenclature Committee of the International Union of Biochemistry and Molecular Biology: Academic Press Ltd, London, UK, 1992. xiii + 862 pp. Price £40.00. ISBN 0-12-227165-3 , 1994 .

[35]  Janet M. Thornton,et al.  Protein domain superfolds and superfamilies , 1994 .

[36]  M. Ashburner,et al.  FlyBase--the Drosophila genetic database. , 1994, Development.

[37]  THE STRUCTURE OF FLAVOCYTOCHROME C SULFIDE DEHYDROGENASE FROM A PURPLE PHOTOTROPHIC BACTERIUM CHROMATIUM VINOSUM AT 2.5 ANGSTROMS RESOLUTION , 1994 .

[38]  H M Holden,et al.  Three-dimensional structure of the biotin carboxylase subunit of acetyl-CoA carboxylase. , 1994, Biochemistry.

[39]  P. Christen,et al.  Evolutionary relationships among pyridoxal‐5′‐phosphate‐dependent enzymes , 1994 .

[40]  D. Ollis,et al.  Crystal structure of Escherichia coli QOR quinone oxidoreductase complexed with NADPH. , 1995, Journal of molecular biology.

[41]  B. Henrissat,et al.  Stereochemistry of Chitin Hydrolysis by a Plant Chitinase/Lysozyme and X-ray Structure of a Complex with Allosamidin , 2001 .

[42]  J. Tainer,et al.  Novel DNA binding motifs in the DNA repair enzyme endonuclease III crystal structure. , 1995, The EMBO journal.

[43]  T. Vernet,et al.  Redesigning the active site of Geotrichum candidum lipase. , 1995, Protein engineering.

[44]  A G Murzin,et al.  SCOP: a structural classification of proteins database for the investigation of sequences and structures. , 1995, Journal of molecular biology.

[45]  J. Martin,et al.  Thioredoxin--a fold for all reasons. , 1995, Structure.

[46]  Cytochrome P450. , 1995, Current opinion in structural biology.

[47]  A G Murzin,et al.  Structural classification of proteins: new superfamilies. , 1996, Current opinion in structural biology.

[48]  C Sander,et al.  Mapping the Protein Universe , 1996, Science.

[49]  B. Halkier Catalytic reactivities and structure/function relationships of cytochrome P450 enzymes , 1996 .

[50]  Georg A. Sprenger,et al.  Crystal structure of transaldolase B from Escherichia coli suggests a circular permutation of the α/β barrel within the class I aldolase family , 1996 .

[51]  J. V. Van Beeumen,et al.  Covalent structure of the flavoprotein subunit of the flavocytochrome c: Sulfide dehydrogenase from the purple phototrophic bacterium chromatium vinosum , 1996, Protein science : a publication of the Protein Society.

[52]  Catalytic reactivities and struc-ture/function relationships of cytochrome P490 enzymes , 1996 .

[53]  M. Hennig,et al.  The 1.8 A resolution structure of hevamine, a plant chitinase/lysozyme, and analysis of the conserved sequence and structure motifs of glycosyl hydrolase family 18. , 1996, Journal of Molecular Biology.

[54]  J M Thornton,et al.  Derivation of 3D coordinate templates for searching structural databases: Application to ser‐His‐Asp catalytic triads in the serine proteinases and lipases , 1996, Protein science : a publication of the Protein Society.

[55]  M. Murphy,et al.  Structural comparison of cupredoxin domains: Domain recycling to construct proteins with novel functions , 1997, Protein science : a publication of the Protein Society.

[56]  Peter Willett,et al.  A polymerase I palm in adenylyl cyclase? , 1997, Nature.

[57]  David C. Jones,et al.  CATH--a hierarchic classification of protein domain structures. , 1997, Structure.

[58]  M. L. Jones,et al.  PDBsum: a Web-based database of summaries and analyses of all PDB structures. , 1997, Trends in biochemical sciences.

[59]  Patricia C. Babbitt,et al.  Understanding Enzyme Superfamilies , 1997, The Journal of Biological Chemistry.

[60]  M Czjzek,et al.  Atomic resolution (1.0 A) crystal structure of Fusarium solani cutinase: stereochemical analysis. , 1997, Journal of molecular biology.

[61]  Gapped BLAST and PSI-BLAST: A new , 1997 .

[62]  I. Rayment,et al.  Structure of carbamoyl phosphate synthetase: a journey of 96 A from substrate to product. , 1997, Biochemistry.

[63]  G. Farber,et al.  Evaluation of functionally important amino acids in L-aspartate ammonia-lyase from Escherichia coli. , 1997, Biochemistry.

[64]  C. Craik,et al.  Evolutionary Divergence of Substrate Specificity within the Chymotrypsin-like Serine Protease Fold* , 1997, The Journal of Biological Chemistry.

[65]  P C Babbitt,et al.  Evolution of an enzyme active site: the structure of a new crystal form of muconate lactonizing enzyme compared with mandelate racemase and enolase. , 1998, Proceedings of the National Academy of Sciences of the United States of America.

[66]  M. Sugantino,et al.  Structure of the hexapeptide xenobiotic acetyltransferase from Pseudomonas aeruginosa. , 1998, Biochemistry.

[67]  S. Kim,et al.  Structure-based assignment of the biochemical function of a hypothetical protein: a test case of structural genomics. , 1998, Proceedings of the National Academy of Sciences of the United States of America.

[68]  J. Tainer,et al.  MutY catalytic core, mutant and bound adenine structures define specificity for DNA repair enzyme superfamily , 1998, Nature Structural Biology.

[69]  Michael Y. Galperin,et al.  The catalytic domain of the P-type ATPase has the haloacid dehalogenase fold. , 1998, Trends in biochemical sciences.

[70]  M. Bafor,et al.  Identification of non-heme diiron proteins that catalyze triple bond and epoxy group formation. , 1998, Science.

[71]  P C Babbitt,et al.  Mechanistically diverse enzyme superfamilies: the importance of chemistry in the evolution of catalysis. , 1998, Current opinion in chemical biology.

[72]  C P Ponting,et al.  Protein fold irregularities that hinder sequence analysis. , 1998, Current opinion in structural biology.

[73]  E. Carpenter,et al.  Structure of dehydroquinate synthase reveals an active site capable of multistep catalysis , 1998, Nature.

[74]  Fan Yang,et al.  Crystal structure of Escherichia coli HdeA , 1998, Nature Structural Biology.

[75]  Y. Satow,et al.  Three-dimensional structure of Escherichia coli glutathione S-transferase complexed with glutathione sulfonate: catalytic roles of Cys10 and His106. , 1998, Journal of molecular biology.

[76]  A. Dunker,et al.  Crystal structure of calsequestrin from rabbit skeletal muscle sarcoplasmic reticulum , 1998, Nature Structural Biology.

[77]  M J Sternberg,et al.  Supersites within superfolds. Binding site similarity in the absence of homology. , 1998, Journal of molecular biology.

[78]  C Colovos,et al.  The 1.8 A crystal structure of the ycaC gene product from Escherichia coli reveals an octameric hydrolase of unknown specificity. , 1998, Structure.

[79]  C. Orengo,et al.  Protein folds and functions. , 1998, Structure.

[80]  A protein disulfide oxidoreductase from the archaeon Pyrococcus furiosus contains two thioredoxin fold units , 1998, Nature Structural Biology.

[81]  ECOLI SODF,et al.  Analogous Enzymes : Independent Inventions in Enzyme Evolution , 1998 .

[82]  A. Murzin How far divergent evolution goes in proteins. , 1998, Current opinion in structural biology.

[83]  G. Petsko,et al.  The crystal structure of benzoylformate decarboxylase at 1.6 A resolution: diversity of catalytic residues in thiamin diphosphate-dependent enzymes. , 1998, Biochemistry.

[84]  R. Kolter,et al.  The crystal structure of Dps, a ferritin homolog that binds and protects DNA , 1998, Nature Structural Biology.

[85]  K. Wilson,et al.  The structure of SAICAR synthase: an enzyme in the de novo pathway of purine nucleotide biosynthesis. , 1998, Structure.

[86]  L. Miercke,et al.  Structure of bovine pancreatic cholesterol esterase at 1.6 A: novel structural features involved in lipase activation. , 1998, Biochemistry.

[87]  N. Grishin,et al.  The Zn-peptidase superfamily: functional convergence after evolutionary divergence. , 1999, Journal of molecular biology.

[88]  A. Fiser,et al.  Convergent evolution of Trichomonas vaginalis lactate dehydrogenase from malate dehydrogenase. , 1999, Proceedings of the National Academy of Sciences of the United States of America.

[89]  Insights into the mechanism of Escherichia coli methionine aminopeptidase from the structural analysis of reaction products and phosphorus-based transition-state analogues. , 1999, Biochemistry.

[90]  A. Volbeda,et al.  Crystal structures of the key anaerobic enzyme pyruvate:ferredoxin oxidoreductase, free and in complex with pyruvate , 1999, Nature Structural Biology.

[91]  C. Orengo CORA—Topological fingerprints for protein structural families , 2008, Protein science : a publication of the Protein Society.

[92]  C. Schofield,et al.  Structural and mechanistic studies on 2-oxoglutarate-dependent oxygenases and related enzymes. , 1999, Current opinion in structural biology.

[93]  R. E. Huber,et al.  Structural comparisons of TIM barrel proteins suggest functional and evolutionary relationships between β‐galactosidase and other glycohydrolases , 2008, Protein science : a publication of the Protein Society.

[94]  K. Volz A test case for structure‐based functional assignment: The 1.2 Å crystal structure of the yjgF gene product from Escherichia coli , 2008, Protein science : a publication of the Protein Society.

[95]  C. Raetz,et al.  The active site of Escherichia coli UDP-N-acetylglucosamine acyltransferase. Chemical modification and site-directed mutagenesis. , 1999, The Journal of biological chemistry.

[96]  A. Goldman,et al.  Of barn owls and bankers: a lush variety of α/β hydrolases , 1999 .

[97]  The schiff base complex of yeast 5‐aminolaevulinic acid dehydratase with laevulinic acid , 1999, Protein science : a publication of the Protein Society.

[98]  A. Goldman,et al.  Of barn owls and bankers: a lush variety of alpha/beta hydrolases. , 1999, Structure.

[99]  D. Herschlag,et al.  Catalytic promiscuity and the evolution of new enzymatic activities. , 1999, Chemistry & biology.

[100]  J. Tainer,et al.  DNA repair mechanisms for the recognition and removal of damaged DNA bases. , 1999, Annual review of biophysics and biomolecular structure.

[101]  J. Rossjohn,et al.  Molecular basis of glutathione synthetase deficiency and a rare gene permutation event , 1999, The EMBO journal.

[102]  A Kihara,et al.  Three-dimensional structure of phosphoenolpyruvate carboxylase: a proposed mechanism for allosteric inhibition. , 1999, Proceedings of the National Academy of Sciences of the United States of America.

[103]  L Rychlewski,et al.  From fold predictions to function predictions: Automation of functional site conservation analysis for functional genome predictions , 1999, Protein science : a publication of the Protein Society.

[104]  M. Gerstein,et al.  The relationship between protein structure and function: a comprehensive survey with application to the yeast genome. , 1999, Journal of molecular biology.

[105]  W. Hol,et al.  Crystal structure of 2-oxoisovalerate and dehydrogenase and the architecture of 2-oxo acid dehydrogenase multienzyme complexes , 1999, Nature Structural Biology.

[106]  M. Nardini,et al.  α/β Hydrolase fold enzymes : the family keeps growing , 1999 .

[107]  P. Bork,et al.  Homology among (betaalpha)(8) barrels: implications for the evolution of metabolic pathways. , 2000, Journal of molecular biology.

[108]  A. Valencia,et al.  Practical limits of function prediction , 2000, Proteins.

[109]  M. Gerstein,et al.  Assessing annotation transfer for genomics: quantifying the relations between protein sequence, structure and function through traditional and probabilistic scores. , 2000, Journal of molecular biology.

[110]  T. N. Bhat,et al.  The Protein Data Bank , 2000, Nucleic Acids Res..

[111]  Rolf Apweiler,et al.  The SWISS-PROT protein sequence database and its supplement TrEMBL in 2000 , 2000, Nucleic Acids Res..

[112]  Michael Y. Galperin,et al.  Aldolases of the DhnA family: a possible solution to the problem of pentose and hexose biosynthesis in archaea. , 2000, FEMS microbiology letters.

[113]  Amos Bairoch,et al.  The ENZYME database in 2000 , 2000, Nucleic Acids Res..

[114]  M. Fraaije,et al.  Flavoenzymes: diverse catalysts with recurrent features. , 2000, Trends in biochemical sciences.

[115]  Two tricks in one bundle: helix-turn-helix gains enzymatic activity. , 2000, Nucleic acids research.