Coevolutionary information, protein folding landscapes, and the thermodynamics of natural selection

Significance Natural protein sequences, being the result of random mutation coupled with natural selection, have remarkable properties that are not typical of unselected random sequences, including the ability to fold robustly to the organized structure needed for function. We estimate the selection temperature, the effective temperature at which sequences were selected by evolution, for eight protein families and compare these values with experimental data for folding temperatures of proteins in each family. The selection temperature measures how important maintaining the stability and structural specificity of the folded state is to the evolutionary process. For all families, the selection temperature is below the folding temperature, indicating that maintaining the structural integrity of the folded state is an important constraint on evolution.

The energy landscape used by nature over evolutionary timescales to select protein sequences is essentially the same as the one that folds these sequences into functioning proteins, sometimes in microseconds. We show that genomic data, physical coarse-grained free energy functions, and family-specific information theoretic models can be combined to give consistent estimates of energy landscape characteristics of natural proteins. One such characteristic is the effective temperature T_sel at which these foldable sequences have been selected in sequence space by evolution. T_sel quantifies the importance of folded-state energetics and structural specificity for molecular evolution. Across all protein families studied, our estimates for T_sel are well below the experimental folding temperatures, indicating that the energy landscapes of natural foldable proteins are strongly funneled toward the native state.
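To make the idea of a selection temperature concrete, the following is a minimal sketch under a simplifying assumption that is not stated in the text above: sequence energies of random sequences are Gaussian distributed, and evolution samples sequences with a Boltzmann weight exp(-E / (k_B T_sel)). Under that assumption the mean energy of selected sequences is lowered by (variance) / (k_B T_sel) relative to random sequences, so the selection temperature can be read off from the energy gap. The function names, the reduced units (k_B = 1), and the synthetic energies are all hypothetical and chosen for illustration; this is not the procedure used by the authors, who combine genomic data with coarse-grained free energy functions.

```python
import math
import random


def selection_temperature(mean_random, std_random, mean_selected, k_B=1.0):
    """Gaussian-landscape estimate of the selection temperature.

    Assumes <E>_selected = <E>_random - std_random**2 / (k_B * T_sel),
    which holds when random-sequence energies are Gaussian and selection
    reweights them by exp(-E / (k_B * T_sel)).
    """
    gap = mean_random - mean_selected
    if gap <= 0:
        raise ValueError("selected sequences must be lower in energy than random ones")
    return std_random ** 2 / (k_B * gap)


def boltzmann_weighted_mean(energies, t_sel, k_B=1.0):
    """Mean energy after reweighting each sample by exp(-E / (k_B * t_sel))."""
    e_min = min(energies)  # subtract the minimum for numerical stability
    weights = [math.exp(-(e - e_min) / (k_B * t_sel)) for e in energies]
    total = sum(weights)
    return sum(w * e for w, e in zip(weights, energies)) / total


# Synthetic check: draw "random sequence" energies, select at a known
# temperature, then recover that temperature from the energy gap.
rng = random.Random(0)
energies = [rng.gauss(0.0, 1.0) for _ in range(100_000)]
t_true = 2.0

mean_rand = sum(energies) / len(energies)
var_rand = sum((e - mean_rand) ** 2 for e in energies) / len(energies)
mean_sel = boltzmann_weighted_mean(energies, t_true)

t_est = selection_temperature(mean_rand, math.sqrt(var_rand), mean_sel)
print(f"true T_sel = {t_true}, estimated T_sel = {t_est:.3f}")
```

In this toy picture, a lower selection temperature corresponds to a larger energy gap between selected and random sequences, which is the sense in which a selection temperature below the folding temperature signals a strongly funneled landscape.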
