Building Markov state models with solvent dynamics

Background
Markov state models have been widely used to study conformational changes of biological macromolecules. These models are built from short-timescale simulations and then propagated to extract long-timescale dynamics. However, current methods often ignore the solvent information in molecular simulations, both because a system contains a large number of solvent molecules and because solvent molecules are indistinguishable when they exchange positions.

Methods
We present a solvent signature that compactly summarizes the solvent distribution in the high-dimensional data, and we use this signature to define a distance metric between configurations. We then incorporate the solvent information into the construction of Markov state models and present a fast geometric clustering algorithm that combines the solute-based and solvent-based distances.

Results
We have tested our method on several molecular dynamical systems, including alanine dipeptide, a carbon nanotube, and benzene rings. With the new solvent-based signatures, we are able to identify different solvent distributions near the solute. Furthermore, when the solute has a concave shape, we can also capture the number of water molecules inside the solute structure. Finally, we have compared the performance of different Markov state models. The experimental results show that our approach improves on existing methods in both computational running time and metastability.

Conclusions
In this paper we have initiated a study of building Markov state models for molecular dynamical systems with solvent degrees of freedom. The methods we describe should also be broadly applicable to a wide range of biomolecular simulation analyses.
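The idea summarized under Methods can be illustrated with a minimal sketch. The abstract does not specify the signature or the clustering algorithm, so everything below is an assumption made for illustration only: the signature is taken to be a count of solvent molecules in radial shells around the solute (which makes it independent of solvent ordering), the solute distance is an unaligned RMSD, the two distances are combined by a hypothetical `weight` parameter, and the clustering is the greedy farthest-point k-center heuristic, swapped in as one simple geometric clustering. All function and parameter names are hypothetical.

```python
import math

def solvent_signature(solute, solvent, shell_edges):
    """Count solvent molecules in radial shells around the solute (assumed signature).

    Each solvent molecule is binned by its distance to the nearest solute atom,
    so the result does not depend on the order of the solvent molecules.
    Molecules at or beyond the last edge are ignored.
    """
    counts = [0] * (len(shell_edges) - 1)
    for w in solvent:
        d = min(math.dist(w, a) for a in solute)
        for i in range(len(counts)):
            if shell_edges[i] <= d < shell_edges[i + 1]:
                counts[i] += 1
                break
    return counts

def solute_distance(a, b):
    """RMSD between two solute conformations with matching atom order (no alignment)."""
    return math.sqrt(sum(math.dist(p, q) ** 2 for p, q in zip(a, b)) / len(a))

def combined_distance(frame_a, frame_b, shell_edges, weight=1.0):
    """Solute RMSD plus a weighted Euclidean distance between solvent signatures."""
    sig_a = solvent_signature(frame_a["solute"], frame_a["solvent"], shell_edges)
    sig_b = solvent_signature(frame_b["solute"], frame_b["solvent"], shell_edges)
    return solute_distance(frame_a["solute"], frame_b["solute"]) + weight * math.dist(sig_a, sig_b)

def k_center(frames, k, dist):
    """Greedy farthest-point clustering; returns (center indices, cluster labels)."""
    centers = [0]
    nearest = [dist(frames[0], f) for f in frames]
    labels = [0] * len(frames)
    while len(centers) < k:
        # The next center is the frame farthest from all current centers.
        far = max(range(len(frames)), key=nearest.__getitem__)
        centers.append(far)
        for i, f in enumerate(frames):
            d = dist(frames[far], f)
            if d < nearest[i]:
                nearest[i] = d
                labels[i] = len(centers) - 1
    return centers, labels

# Toy data: one fixed two-atom solute with three different solvent arrangements.
edges = [0.0, 1.0, 2.0, 4.0]
solute = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
wet = {"solute": solute, "solvent": [(0.0, 0.5, 0.0), (1.0, 0.5, 0.0), (0.0, 3.0, 0.0)]}
wet_permuted = {"solute": solute, "solvent": [(0.0, 3.0, 0.0), (1.0, 0.5, 0.0), (0.0, 0.5, 0.0)]}
dry = {"solute": solute, "solvent": [(0.0, 3.0, 0.0), (1.0, 3.0, 0.0), (0.0, 3.5, 0.0)]}

frames = [wet, wet_permuted, dry]
centers, labels = k_center(frames, 2, lambda a, b: combined_distance(a, b, edges))
```

In this toy example the solute is identical in all three frames, so a purely solute-based distance would not separate them. The signature term places the two "wet" frames, which differ only by a relabeling of solvent molecules, in one cluster and the "dry" frame in another.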
