Stochastic roadmap simulation: an efficient representation and algorithm for analyzing molecular motion

Classic techniques for simulating molecular motion, such as the Monte Carlo and molecular dynamics methods, generate individual motion pathways one at a time and spend most of their time trying to escape from the local minima of a molecule's energy landscape. Their high computational cost prevents them from being used to analyze many pathways. We introduce Stochastic Roadmap Simulation (SRS), a new approach for exploring the kinetics of molecular motion by simultaneously examining multiple pathways encoded compactly in a graph, called a roadmap. A roadmap is computed by sampling a molecule's conformation space at random. The computation does not suffer from the local-minima problem encountered with existing methods. Each path in the roadmap represents a potential motion pathway and is associated with a probability indicating the likelihood that the molecule follows this pathway. By viewing the roadmap as a Markov chain, we can efficiently compute kinetic properties of molecular motion over the entire molecular energy landscape. We also prove that, in the limit, SRS converges to the same distribution as Monte Carlo simulation. To test the effectiveness of our approach, we apply it to the computation of the transmission coefficients for protein folding, an important order parameter that measures the "kinetic distance" of a protein's conformation to its native state. Our computational studies show that SRS obtains more accurate results and reduces computation time by several orders of magnitude, compared with Monte Carlo simulation.
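The idea of treating a roadmap as a Markov chain can be illustrated with a small sketch. The code below is not the authors' implementation: the function names `build_transition_matrix` and `folding_probability`, the five-node toy roadmap, and its energies are all hypothetical, and the Metropolis-style transition rule with a uniform proposal of 1/(maximum degree) is an assumption chosen so that the chain's stationary distribution is the Boltzmann distribution. Under those assumptions, the probability of reaching a "folded" node before an "unfolded" node from every roadmap node is obtained with one linear solve (first-step analysis), rather than by simulating many individual pathways.

```python
import numpy as np


def build_transition_matrix(energies, neighbors, kT=1.0):
    """Row-stochastic matrix for a roadmap with Metropolis-style moves.

    Assumption (not from the source): each neighbor is proposed with
    probability 1 / max_degree, and the leftover mass is a self-loop.
    """
    energies = np.asarray(energies, dtype=float)
    n = len(energies)
    max_degree = max(len(neighbors[i]) for i in range(n))
    P = np.zeros((n, n))
    for i in range(n):
        for j in neighbors[i]:
            dE = energies[j] - energies[i]
            # Downhill moves are always accepted; uphill moves are
            # accepted with the Boltzmann factor exp(-dE / kT).
            accept = 1.0 if dE <= 0 else np.exp(-dE / kT)
            P[i, j] = accept / max_degree
        P[i, i] = 1.0 - P[i].sum()  # remaining probability: stay put
    return P


def folding_probability(P, folded, unfolded):
    """Probability of hitting `folded` before `unfolded` from each node.

    First-step analysis: q = 1 on folded nodes, q = 0 on unfolded nodes,
    and q_i = sum_j P[i, j] * q_j on every other (transient) node.
    """
    n = P.shape[0]
    folded = sorted(set(folded))
    unfolded = set(unfolded)
    transient = [i for i in range(n) if i not in unfolded and i not in folded]
    # Solve (I - P_TT) q_T = P_TF * 1 for the transient nodes.
    A = np.eye(len(transient)) - P[np.ix_(transient, transient)]
    b = P[np.ix_(transient, folded)].sum(axis=1)
    q = np.zeros(n)
    q[folded] = 1.0
    q[transient] = np.linalg.solve(A, b)
    return q


if __name__ == "__main__":
    # Hypothetical toy roadmap: a path of five conformations.
    toy_energies = [0.0, 1.0, 0.5, 2.0, -1.0]
    toy_neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
    P = build_transition_matrix(toy_energies, toy_neighbors, kT=1.0)
    q = folding_probability(P, folded=[4], unfolded=[0])
    for node, value in enumerate(q):
        print(f"node {node}: probability of folding first = {value:.3f}")
```

The linear solve touches every node at once, which is the sense in which a roadmap lets many pathways be examined simultaneously. A dense solver is used here only for brevity; a large roadmap would more plausibly call for a sparse or iterative solver.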
