The Pattern of Amino Acid Replacements in α/β-Barrels

The determinants of site-to-site variability in the rate of amino acid replacement in α/β-barrel enzyme structures are investigated. Of 125 available α/β-barrel structures, only 25 meet a variety of phylogenetic and statistical criteria necessary to ensure sufficient data for reliable analysis. These 25 enzyme structures (from a wide variety of taxa with diverse lifestyles in diverse habitats) differ greatly in size, number, and topology of domains in addition to the α/β-barrel, quaternary structure, metabolic role, reaction catalyzed, presence of prosthetic groups, regulatory mechanisms, use of cofactors, and catalytic mechanisms. Yet, with the exception of ribulose-1,5-bisphosphate carboxylase, all structures have similar frequency distributions of amino acid replacement rates. Hence, site-specific variability in rates of evolution is largely independent of differences in biology, biochemistry, and molecular structure. Three factors together explain approximately half the causal variation in site-specific rates: (1) distance from the active site, (2) solvent accessibility, and (3) treating glycines in unusual main-chain conformations as a separate class. Secondary structure exerts little influence on the pattern and distribution of replacements. Additional domains and subunits, side-chain hydrogen bonds, unusual side-chain rotamers, nonplanar peptide bonds, strained main-chain conformations, and buried hydrophilic-charged residues contribute little to variability among sites because they are rare. Nonlinear models do not improve the fits. In several enzymes, deviations from the typical pattern of replacements suggest the possible action of natural selection. A statistical analysis shows that, in all cases, much of the remaining unexplained variation is not attributable to chance and that other, as yet unidentified, causal relations must exist.
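As a rough illustration of the kind of model the abstract describes, the sketch below fits a linear model of per-site replacement rate on three predictors: distance from the active site, solvent accessibility, and a flag for glycines in unusual main-chain conformations. It is a minimal sketch under stated assumptions, not the authors' procedure: the additive linear form, the function names, and all numbers are hypothetical, and the toy rates are generated noise-free from made-up coefficients, so the perfect fit says nothing about real enzymes.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Build an augmented matrix from copies so the inputs are not modified.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_linear_model(sites, rates):
    """Ordinary least squares via the normal equations.

    Returns [intercept, b_distance, b_accessibility, b_glycine].
    """
    design = [[1.0, d, a, float(g)] for d, a, g in sites]
    k = 4
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(design, rates)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def r_squared(sites, rates, coef):
    """Fraction of the variance in rates accounted for by the fitted model."""
    pred = [coef[0] + coef[1] * d + coef[2] * a + coef[3] * g for d, a, g in sites]
    mean = sum(rates) / len(rates)
    ss_res = sum((y - p) ** 2 for y, p in zip(rates, pred))
    ss_tot = sum((y - mean) ** 2 for y in rates)
    return 1.0 - ss_res / ss_tot


# Hypothetical sites: (distance from active site in angstroms,
# fractional solvent accessibility, unusual-glycine flag).
sites = [
    (4.0, 0.05, 0), (8.0, 0.30, 0), (12.0, 0.10, 1), (16.0, 0.60, 0),
    (20.0, 0.80, 0), (10.0, 0.20, 1), (24.0, 0.40, 0),
]
# Made-up generating coefficients; real rates would come from a phylogeny.
true_coef = [0.1, 0.04, 0.5, -0.3]
rates = [true_coef[0] + true_coef[1] * d + true_coef[2] * a + true_coef[3] * g
         for d, a, g in sites]

coef = fit_linear_model(sites, rates)
r2 = r_squared(sites, rates, coef)
```

With real data the residual variation would be substantial; the abstract reports that roughly half the causal variation is explained, and that nonlinear models do not improve the fits.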
