Best Practices for Quantification of Uncertainty and Sampling Quality in Molecular Simulations [Article v1.0].

The quantitative assessment of uncertainty and sampling quality is essential in molecular simulation. Many systems of interest are highly complex, often at the edge of current computational capabilities. Modelers must therefore analyze and communicate statistical uncertainties so that "consumers" of simulated data understand their significance and limitations. This article covers key analyses appropriate for trajectory data generated by conventional simulation methods such as molecular dynamics and (single Markov chain) Monte Carlo. It also provides guidance for analyzing some "enhanced" sampling approaches. We do not discuss systematic errors arising, e.g., from inaccuracy in the chosen model or force field.
