Exploring Protein Dynamics Space: The Dynasome as the Missing Link between Protein Structure and Function

Proteins are usually described and classified according to amino acid sequence, structure, or function. Here, we develop a minimally biased scheme to compare and classify proteins according to their internal mobility patterns. This approach is based on the notion that proteins not only fold into recurring structural motifs but might also carry out only a limited set of recurring mobility motifs. The complete set of these patterns, which we tentatively call the dynasome, spans a multi-dimensional space whose axes, the dynasome descriptors, characterize different aspects of protein dynamics. The unique dynamic fingerprint of each protein is represented as a vector in the dynasome space. The difference between any two vectors consequently gives a reliable measure of the difference between the corresponding protein dynamics. We characterize the properties of the dynasome by comparing the dynamics fingerprints obtained from molecular dynamics simulations of 112 proteins, but our approach is, in principle, not restricted to any specific source of protein dynamics data. We conclude that (1) the dynasome consists of a continuum of proteins rather than well-separated classes; (2) for the majority of proteins, we observe strong correlations between structure and dynamics; and (3) proteins with similar function carry out similar dynamics, which suggests a new method to improve protein function annotation based on protein dynamics.
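
The idea of representing each protein as a vector of dynasome descriptors and comparing proteins by the difference between vectors can be illustrated with a minimal sketch. The abstract does not specify the descriptors, their scaling, or the distance measure, so everything below is an assumption made for illustration: the three descriptor columns and their values are invented, z-score standardization is one possible way to put descriptors on a common scale, and Euclidean distance is one possible choice of difference measure.

```python
import math
from statistics import mean, pstdev


def standardize(fingerprints):
    """Z-score each descriptor (column) so no single axis dominates the distance.

    Assumption: the source does not state how descriptors are scaled.
    """
    columns = list(zip(*fingerprints))
    mus = [mean(c) for c in columns]
    # A constant descriptor has zero spread; fall back to 1.0 to avoid dividing by zero.
    sigmas = [pstdev(c) or 1.0 for c in columns]
    return [[(x - m) / s for x, m, s in zip(row, mus, sigmas)] for row in fingerprints]


def dynasome_distance(u, v):
    """Euclidean distance between two fingerprint vectors (an assumed metric)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


# Hypothetical fingerprints: one vector per protein, one invented descriptor per column.
fingerprints = {
    "protein_A": [0.12, 3.4, 0.80],
    "protein_B": [0.15, 3.1, 0.75],
    "protein_C": [0.60, 9.8, 0.20],
}

names = list(fingerprints)
scaled = dict(zip(names, standardize([fingerprints[n] for n in names])))

d_ab = dynasome_distance(scaled["protein_A"], scaled["protein_B"])
d_ac = dynasome_distance(scaled["protein_A"], scaled["protein_C"])
print(f"A-B: {d_ab:.3f}, A-C: {d_ac:.3f}")
```

In this toy data, proteins A and B have similar descriptor values and therefore end up closer to each other than either is to protein C. With real fingerprints, the same pairwise distances could feed a clustering or nearest-neighbor analysis.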
