Using Dimensionality Reduction to Better Capture RNA and Protein Folding Motions

Molecular motions of both proteins and RNA play an essential role in many biochemical processes. Simulations have been used to study these large-scale motions in detail, but they are often limited by the cost of representing complex molecular structures. For example, the number of RNA conformations with valid contacts grows exponentially with sequence length, and the complexity of protein motion increases with both the model's level of detail and the protein's length. In this paper, we explore the use of dimensionality reduction techniques to better approximate protein and RNA motions. We present two new methods for studying motions: (1) an evaluation technique for comparing different distributions of conformations and (2) a way to identify likely local motion transitions. We combine these two methods in an existing motion framework to study large-scale motions of both proteins and RNA. We show that dimensionality reduction can be applied effectively even to discrete conformation spaces, such as RNA secondary structure, that do not typically lend themselves to reduction techniques.
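To make the idea of reducing a discrete conformation space concrete, the following is a minimal sketch, not the method used in this paper. It assumes that RNA secondary structures are given in dot-bracket notation, that each structure is encoded as a binary vector of base-pair contacts, and that plain principal component analysis (computed with an SVD) is the reduction technique. The function names `contact_vector` and `pca_project` and the toy structures are hypothetical and chosen only for illustration.

```python
import numpy as np

def contact_vector(dot_bracket):
    """Encode a dot-bracket string as a 0/1 vector over all index pairs (i, j), i < j.

    Assumption: the encoding is illustrative; a 1 marks a base pair between i and j.
    """
    n = len(dot_bracket)
    stack, pairs = [], set()
    for j, ch in enumerate(dot_bracket):
        if ch == "(":
            stack.append(j)                 # remember the opening position
        elif ch == ")":
            pairs.add((stack.pop(), j))     # close the most recent open base
    return np.array([1.0 if (i, j) in pairs else 0.0
                     for i in range(n) for j in range(i + 1, n)])

def pca_project(X, k):
    """Project the rows of X onto their top-k principal components.

    Returns the k-dimensional coordinates and the fraction of variance
    explained by each of the k components.
    """
    Xc = X - X.mean(axis=0)                 # center each feature
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    variance = S ** 2
    return Xc @ Vt[:k].T, variance[:k] / variance.sum()

# Hypothetical toy ensemble of length-8 secondary structures.
structures = ["........", "(......)", "((....))", ".(....).", "..(..).."]
X = np.vstack([contact_vector(s) for s in structures])
coords, ratio = pca_project(X, k=2)
```

Here `coords` holds one two-dimensional point per structure, so distances between conformations can be inspected in the reduced space, and `ratio` indicates how much of the variance in the contact encoding those two dimensions retain.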
