Solvent exposure imparts similar selective pressures across a range of yeast proteins.

We study how an amino acid residue's solvent exposure influences its propensity for substitution by analyzing multiple alignments of 61 yeast genes whose encoded proteins have known crystal structures. We find that the selective constraint on interior residues is, on average, 10 times that on surface residues. Surprisingly, there is no correlation between the overall selective constraint observed for a protein alignment and the ratio of the constraints on interior and surface residues. By modeling the selective constraint on several amino acid properties, we show that although residue volume and hydropathy are strongly conserved across most alignments, their conservation differs little between interior and surface residues. By contrast, residue charge (isoelectric point) is less generally conserved across the protein as a whole, but there is a strong constraint against introducing charged residues into the protein interior.
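The interior-versus-surface comparison can be pictured with a small sketch. It is not the method of the paper, which the abstract does not detail. It assumes each site comes with a relative solvent accessibility and a per-site rate ratio (called `omega` here, with lower values meaning stronger constraint), and it uses a hypothetical accessibility cutoff of 0.25 to separate interior from surface sites. The function name, the cutoff, and the toy values are all illustrative assumptions.

```python
from statistics import mean


def constraint_ratio(sites, rsa_cutoff=0.25):
    """Return mean surface omega divided by mean interior omega.

    sites: iterable of (relative_accessibility, omega) pairs.
    rsa_cutoff: hypothetical threshold; sites below it count as interior.
    A result of about 10 would mean the interior is roughly ten times
    more constrained than the surface under this simplified measure.
    """
    sites = list(sites)
    interior = [omega for rsa, omega in sites if rsa < rsa_cutoff]
    surface = [omega for rsa, omega in sites if rsa >= rsa_cutoff]
    if not interior or not surface:
        raise ValueError("need at least one interior and one surface site")
    mean_interior = mean(interior)
    if mean_interior == 0:
        raise ValueError("mean interior omega is zero; ratio is undefined")
    return mean(surface) / mean_interior


# Toy values chosen for illustration only; they are not data from the study.
toy_sites = [(0.05, 0.01), (0.10, 0.03), (0.60, 0.25), (0.80, 0.15)]
ratio = constraint_ratio(toy_sites)
print(f"surface/interior omega ratio: {ratio:.1f}")
```

Computing this ratio per alignment and comparing it with each alignment's overall rate ratio is one way to probe the reported lack of correlation between the two quantities.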
