Patterns of coevolving amino acids unveil structural and dynamical domains

Significance

Patterns of pairwise correlations in sequence alignments can be used to reconstruct the network of residue-residue contacts and thus the three-dimensional structure of proteins. Less explored, and yet extremely intriguing, is the functional relevance of such coevolving networks: Do they encode the collective motions occurring in proteins at thermal equilibrium? Here, by combining coevolutionary coupling analysis with a state-of-the-art dimensionality reduction approach, we show that the network of pairwise evolutionary couplings can be analyzed to reveal communities of amino acids, which we term “evolutionary domains,” that are in striking agreement with the quasi-rigid protein domains obtained from elastic network models and molecular dynamics simulations.

Patterns of interacting amino acids are so preserved within protein families that the sole analysis of evolutionary comutations can identify pairs of contacting residues. It is also known that evolution conserves functional dynamics, i.e., the concerted motion or displacement of large protein regions or domains. Is it, therefore, possible to use a pure sequence-based analysis to identify these dynamical domains? To address this question, we introduce here a general coevolutionary coupling analysis strategy and apply it to a curated sequence database of hundreds of protein families. For most families, the sequence-based method partitions amino acids into a few clusters. When viewed in the context of the native structure, these clusters have the signature characteristics of viable protein domains: They are spatially separated but individually compact. They have a direct functional bearing too, as shown for various reference cases. We conclude that even large-scale structural and functionally related properties can be recovered from inference methods applied to evolutionary-related sequences.
The method introduced here is available as a software package and web server (spectrus.sissa.it/spectrus-evo_webserver).
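As a rough illustration of how a matrix of pairwise evolutionary couplings could be partitioned into a few clusters of residues, the following minimal sketch applies a generic spectral clustering to a symmetric, non-negative coupling matrix. It is not the published implementation: the function name `spectral_domains`, the toy matrix, and the choice of a normalized affinity followed by a simple k-means step are assumptions made only for this example, and the couplings themselves would have to come from a separate coevolutionary analysis.

```python
import numpy as np


def spectral_domains(couplings, n_domains, n_iter=50):
    """Hypothetical helper: group residues by spectral clustering of a coupling matrix."""
    w = np.asarray(couplings, dtype=float)
    w = 0.5 * (w + w.T)              # enforce symmetry
    np.fill_diagonal(w, 0.0)         # ignore self-couplings

    # Symmetrically normalized affinity D^{-1/2} W D^{-1/2}.
    degree = w.sum(axis=1)
    inv_sqrt = 1.0 / np.sqrt(np.maximum(degree, 1e-12))
    norm_w = w * inv_sqrt[:, None] * inv_sqrt[None, :]

    # Embed each residue using the eigenvectors with the largest eigenvalues
    # (np.linalg.eigh returns eigenvalues in ascending order).
    _, eigvecs = np.linalg.eigh(norm_w)
    emb = eigvecs[:, -n_domains:]
    emb = emb / np.maximum(np.linalg.norm(emb, axis=1, keepdims=True), 1e-12)

    # Deterministic farthest-point initialization of the cluster centers.
    centers = [emb[0]]
    for _ in range(1, n_domains):
        d = np.min([np.sum((emb - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(emb[np.argmax(d)])
    centers = np.array(centers)

    # Plain k-means refinement in the embedded space.
    for _ in range(n_iter):
        dists = ((emb[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        new_centers = np.array([
            emb[labels == k].mean(axis=0) if np.any(labels == k) else centers[k]
            for k in range(n_domains)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers

    dists = ((emb[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1)


# Toy coupling matrix (made-up values): two groups of three residues,
# strongly coupled within each group and weakly coupled between groups.
toy = np.full((6, 6), 0.05)
toy[:3, :3] = 1.0
toy[3:, 3:] = 1.0

labels = spectral_domains(toy, n_domains=2)
# The first three residues share one label and the last three share the other.
```

In this toy case the two strongly coupled blocks are recovered as two clusters. With real data, the number of clusters and the quality of the partition would need to be assessed separately, for instance against the native structure as described above.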
