FAST Conformational Searches by Balancing Exploration/Exploitation Trade-Offs.

Molecular dynamics simulations are a powerful means of understanding conformational changes. However, it is still difficult to simulate biologically relevant time scales without the use of specialized supercomputers. Here, we introduce a goal-oriented sampling method, called fluctuation amplification of specific traits (FAST), for extending the capabilities of commodity hardware. This algorithm rapidly searches conformational space for structures with desired properties by balancing trade-offs between focused searches around promising solutions (exploitation) and trying novel solutions (exploration). FAST was inspired by the hypothesis that many physical properties have an overall gradient in conformational space, akin to the energetic gradients that are known to guide proteins to their folded states. For example, we expect that transitioning from a conformation with a small solvent-accessible surface area to one with a large surface area will require passing through a series of conformations with steadily increasing surface areas. We demonstrate that such gradients are common through retrospective analysis of existing Markov state models (MSMs). Then we design the FAST algorithm to exploit these gradients to find structures with desired properties by (1) recognizing and amplifying structural fluctuations along gradients that optimize a selected physical property whenever possible, (2) overcoming barriers that interrupt these overall gradients, and (3) rerouting to discover alternative paths when faced with insurmountable barriers. To test FAST, we compare its performance to other methods for three common types of problems: (1) identifying unexpected binding pockets, (2) discovering the preferred paths between specific structures, and (3) folding proteins. Our conservative estimate is that FAST outperforms conventional simulations and an adaptive sampling algorithm by at least an order of magnitude. Furthermore, FAST yields both the proper thermodynamics and kinetics, allowing for a direct connection with kinetic experiments that is impossible with many other advanced sampling algorithms because they provide only thermodynamic information. Therefore, we expect FAST to be of great utility for a wide range of applications.
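
The exploration/exploitation balance described above can be illustrated with a small ranking routine. This is a minimal sketch under stated assumptions, not the FAST implementation: it assumes conformations have already been grouped into discrete states, that each state has one value of the property being optimized and a count of how often it has been observed, and that a state's score is a min-max-scaled property value (exploitation) plus a weighted bonus for rarely visited states (exploration). The names `min_max_scale`, `rank_states`, `alpha`, and `n_select` are hypothetical and chosen for illustration only.

```python
import numpy as np


def min_max_scale(values):
    """Rescale values to the range [0, 1]; return all zeros if they are constant."""
    values = np.asarray(values, dtype=float)
    span = values.max() - values.min()
    if span == 0:
        return np.zeros_like(values)
    return (values - values.min()) / span


def rank_states(property_values, visit_counts, alpha=1.0, n_select=3):
    """Pick the states from which to start the next round of simulations.

    Assumed scoring rule (illustrative, not taken from the source):
        score = scaled property value + alpha * scaled rarity
    alpha = 0 gives a purely greedy search along the property;
    larger alpha favors poorly sampled states.
    """
    # Exploitation term: states with larger property values score higher.
    directed = min_max_scale(property_values)
    # Exploration term: states with fewer observations score higher.
    undirected = min_max_scale(-np.asarray(visit_counts, dtype=float))
    score = directed + alpha * undirected
    # Sort from highest to lowest score; a stable sort keeps ties in index order.
    order = np.argsort(-score, kind="stable")
    return order[:n_select], score


if __name__ == "__main__":
    # Hypothetical per-state surface areas and observation counts.
    areas = [1.0, 2.0, 3.0, 4.0]
    counts = [10, 1, 10, 10]
    chosen, score = rank_states(areas, counts, alpha=1.0, n_select=2)
    print("states chosen for the next round:", chosen.tolist())
```

In an adaptive loop, new short simulations would be launched from the chosen states, the states and their counts would be updated, and the ranking would be repeated. Setting `alpha` to zero reduces the sketch to following the property gradient alone, which is the regime in which barriers and dead ends would stall the search.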
