Selection on protein structure, interaction, and sequence

Characterizing the probabilities of observing amino acid substitutions at specific sites in a protein over evolutionary time is a major goal in the field of molecular evolution. While purely statistical approaches at different levels of complexity exist, approaches rooted in underlying biological processes are necessary both to characterize the context-dependence of sequence changes (epistasis) and to extrapolate to sequences not observed in biological databases. To develop such approaches, an understanding of the different selective forces that act on amino acid substitution is necessary. Here, an overview is presented of selection on, and corresponding modeling of, folding stability, folding specificity, binding affinity and specificity for ligands, the evolution of new binding sites on protein surfaces, protein dynamics, intrinsic disorder, and protein aggregation, as well as the interplay with protein expression level (concentration) and biased mutational processes.
