Accelerated Estimation of Long-timescale Kinetics by Combining Weighted Ensemble Simulation with Markov Model “Microstates” using Non-Markovian Theory

The weighted ensemble (WE) simulation strategy provides unbiased sampling of non-equilibrium processes, such as molecular folding or binding, but the extraction of rate constants relies on characterizing steady-state behavior. Unfortunately, WE simulations of sufficiently complex systems will not relax to steady state within accessible simulation times. Here we show that a post-simulation clustering of molecular configurations into "microbins", using methods developed in the Markov State Model (MSM) community, can yield unbiased kinetics from WE data before steady-state convergence of the WE simulation itself. Because WE trajectories are directional and not equilibrium-distributed, the history-augmented MSM (haMSM) formulation can be used, which yields the mean first-passage time (MFPT) without bias for arbitrarily small lag times. Accurate kinetics can thus be obtained while bypassing the often prohibitive convergence requirements of the non-equilibrium weighted ensemble. We validate the method on a simple diffusive process on a 2D random energy landscape, and then analyze atomistic protein folding simulations generated with WE molecular dynamics. We report significant progress towards the unbiased estimation of protein folding times and pathways, though key challenges remain.
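To make the link between steady state and the MFPT concrete, the following is a minimal sketch on a toy Markov chain, not the haMSM estimator applied to WE data. It assumes a row-stochastic transition matrix over a handful of "microbins", imposes a recycling boundary condition (probability reaching the target is fed back to the source), and recovers the MFPT as the lag time divided by the steady-state flux into the target, a relation often called the Hill relation. The function names (`toy_chain`, `steady_state_mfpt`, `direct_mfpt`) and the toy matrix are hypothetical and introduced only for illustration.

```python
import numpy as np


def toy_chain(n=6, p=0.3):
    """Hypothetical 1D nearest-neighbor chain with reflecting ends."""
    T = np.zeros((n, n))
    for i in range(n):
        if i > 0:
            T[i, i - 1] = p
        if i < n - 1:
            T[i, i + 1] = p
        T[i, i] = 1.0 - T[i].sum()  # remaining probability stays put
    return T


def steady_state_mfpt(T, source, target, lag=1.0):
    """MFPT(source -> target) from the recycling steady state."""
    n = T.shape[0]
    T_ss = T.copy()
    # Recycling boundary: anything in the target returns to the source.
    T_ss[target, :] = 0.0
    T_ss[target, source] = 1.0
    # Solve pi @ T_ss = pi together with sum(pi) = 1.
    A = np.vstack([T_ss.T - np.eye(n), np.ones(n)])
    b = np.zeros(n + 1)
    b[-1] = 1.0
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)
    # Normalize over non-target states so that flux = 1 / MFPT.
    pi[target] = 0.0
    pi /= pi.sum()
    flux = pi @ T[:, target]  # probability entering the target per lag
    return lag / flux


def direct_mfpt(T, source, target, lag=1.0):
    """Reference MFPT by solving (I - Q) m = 1 on the non-target states."""
    n = T.shape[0]
    keep = [i for i in range(n) if i != target]
    Q = T[np.ix_(keep, keep)]
    m = np.linalg.solve(np.eye(len(keep)) - Q, np.ones(len(keep)))
    return lag * m[keep.index(source)]


T = toy_chain()
mfpt_flux = steady_state_mfpt(T, source=0, target=5)
mfpt_ref = direct_mfpt(T, source=0, target=5)
print(mfpt_flux, mfpt_ref)
```

For a truly Markovian chain like this one, the two estimates agree. The sketch only illustrates why a steady-state flux determines the MFPT; it does not reproduce the history-augmented construction used for WE trajectories.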
