Unbiased estimation of equilibrium, rates, and committors from Markov state model analysis

Markov state models (MSMs) have been broadly adopted for analyzing molecular dynamics trajectories, but the approximate nature of the models that results from coarse-graining into discrete states is a long-known limitation. We show theoretically that, despite the coarse-graining, MSM-like analysis can in principle yield unbiased estimates of key observables. We describe unbiased estimators for equilibrium state populations, for the mean first-passage time (MFPT) of an arbitrary process, and for state committors (i.e., splitting probabilities). In general, the estimators are only asymptotically unbiased, but we describe how an extension of a recently proposed reweighting scheme can accelerate relaxation to unbiased values. Exactly accounting for ‘sliding window’ averaging over finite-length trajectories is a key, novel element of our analysis. Overall, our analysis indicates that coarse-grained MSMs are asymptotically unbiased for steady-state properties only when appropriate boundary conditions (e.g., source-sink for MFPT estimation) are applied directly to trajectories, prior to calculation of the appropriate transition matrix.
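For orientation, the quantities named above can be illustrated with the conventional MSM workflow: sliding-window transition counts at a lag time, a row-normalized transition matrix, and linear-algebra solutions for the stationary populations, committor, and MFPT. The following is a minimal sketch of that textbook calculation only, not the unbiased estimators or the reweighting scheme described in this work; all function names are hypothetical, and the trajectory-level boundary conditions emphasized in the abstract are not implemented here.

```python
import numpy as np

def count_matrix(dtrajs, n_states, lag):
    """Sliding-window transition counts: every frame t with t + lag
    inside the same trajectory contributes one count (t -> t + lag)."""
    counts = np.zeros((n_states, n_states))
    for traj in dtrajs:
        traj = np.asarray(traj, dtype=int)
        np.add.at(counts, (traj[:-lag], traj[lag:]), 1.0)
    return counts

def transition_matrix(counts):
    """Row-normalize counts (assumes every state has outgoing counts)."""
    return counts / counts.sum(axis=1, keepdims=True)

def stationary_distribution(T):
    """Left eigenvector of T with eigenvalue 1, normalized to sum to 1."""
    evals, evecs = np.linalg.eig(T.T)
    v = np.real(evecs[:, np.argmax(np.real(evals))])
    return v / v.sum()

def committor(T, source, sink):
    """Probability of reaching `sink` before `source` from each state:
    q = 0 on source, q = 1 on sink, q_i = sum_j T_ij q_j elsewhere."""
    n = T.shape[0]
    A = T - np.eye(n)
    b = np.zeros(n)
    for i in list(source) + list(sink):
        A[i, :] = 0.0
        A[i, i] = 1.0
    for i in sink:
        b[i] = 1.0
    return np.linalg.solve(A, b)

def mfpt(T, sink, lag=1.0):
    """Mean first-passage time to `sink` in units of `lag`:
    m = 0 on sink, m_i = lag + sum_j T_ij m_j elsewhere."""
    n = T.shape[0]
    A = np.eye(n) - T
    b = np.full(n, float(lag))
    for i in sink:
        A[i, :] = 0.0
        A[i, i] = 1.0
        b[i] = 0.0
    return np.linalg.solve(A, b)

# Toy three-state chain (assumed values, for illustration only).
T = np.array([[0.9, 0.1, 0.0],
              [0.1, 0.8, 0.1],
              [0.0, 0.1, 0.9]])
pi = stationary_distribution(T)
q = committor(T, source=[0], sink=[2])
m = mfpt(T, sink=[2])
```

In this sketch the source and sink conditions are imposed on the transition matrix after it has been built. The abstract's conclusion is that, for coarse-grained states, asymptotic unbiasedness instead requires applying such boundary conditions to the trajectories themselves before the transition matrix is computed.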
