Predictive coarse-graining

Abstract. We propose a data-driven coarse-graining formulation in the context of equilibrium statistical mechanics. In contrast to existing techniques, which are based on a fine-to-coarse map, we adopt the opposite strategy and prescribe a probabilistic coarse-to-fine map. This corresponds to a directed probabilistic model in which the coarse variables play the role of latent generators of the fine-scale (all-atom) data. From an information-theoretic perspective, the proposed framework improves upon the relative entropy method [1] and can quantify the uncertainty due to the information loss that unavoidably takes place during coarse-graining. Furthermore, it can be readily extended to a fully Bayesian model in which the various sources of uncertainty are reflected in the posterior of the model parameters. This posterior can be used to produce not only point estimates of fine-scale reconstructions or macroscopic observables but, more importantly, predictive posterior distributions on these quantities. These predictive posterior distributions reflect the confidence of the model as a function of the amount of data and the level of coarse-graining. Model complexity and model selection are addressed by employing a hierarchical prior that favors sparse solutions, revealing the most prominent features in the coarse-grained model. A flexible and parallelizable Monte Carlo Expectation-Maximization (MC-EM) scheme is proposed for carrying out inference and learning tasks. A comparative assessment of the proposed methodology is presented for a lattice spin system and the SPC/E water model.
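As a rough illustration of the MC-EM idea named in the abstract, the sketch below fits a deliberately simple coarse-to-fine model in which a scalar coarse variable X ~ N(mu, 1) generates a scalar fine-scale datum x | X ~ N(X, noise_var). The toy model, the function name `mc_em_toy`, and all parameter values are assumptions made for illustration and are not taken from the paper. Here the posterior over X is Gaussian and can be sampled directly; in a realistic coarse-graining problem the E-step would typically require MCMC.

```python
import random
import statistics


def mc_em_toy(fine_data, noise_var=0.25, n_iters=30, n_samples=20, seed=0):
    """Estimate mu in the hypothetical toy model
    X ~ N(mu, 1) (coarse),  x | X ~ N(X, noise_var) (fine)."""
    rng = random.Random(seed)
    mu = 0.0
    # For this model p(X | x, mu) is Gaussian with a fixed variance.
    post_var = 1.0 / (1.0 / noise_var + 1.0)
    post_std = post_var ** 0.5
    for _ in range(n_iters):
        # Monte Carlo E-step: draw latent coarse variables for each fine datum.
        samples = []
        for x in fine_data:
            post_mean = post_var * (x / noise_var + mu)
            samples.extend(rng.gauss(post_mean, post_std) for _ in range(n_samples))
        # M-step: the sampled complete-data log-likelihood is maximized
        # in mu by the average of the sampled coarse variables.
        mu = statistics.fmean(samples)
    return mu


# Synthetic fine-scale data generated from the same toy model (std 0.5 -> var 0.25).
data_rng = random.Random(1)
true_mu = 2.0
fine_data = [data_rng.gauss(data_rng.gauss(true_mu, 1.0), 0.5) for _ in range(500)]
mu_hat = mc_em_toy(fine_data)
```

For this toy model the exact maximum-likelihood estimate of mu is the sample mean of the fine-scale data, so `mu_hat` should land close to `statistics.fmean(fine_data)` up to Monte Carlo noise. The per-datum sampling loop is independent across data points, which is the part of such a scheme that parallelizes naturally.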

[1]  Jonathan V. Selinger,et al.  Introduction to the Theory of Soft Matter , 2015 .

[2]  Markos A. Katsoulakis,et al.  Information Loss in Coarse-Graining of Stochastic Particle Dynamics , 2006 .

[3]  J. Oden,et al.  Calibration and validation of coarse-grained models of atomic systems: application to semiconductor manufacturing , 2014 .

[4]  Alexander Fischer,et al.  Identification of biomolecular conformations from incomplete torsion angle observations by hidden markov models , 2007, J. Comput. Chem..

[5]  Radek Erban,et al.  Coupling all-atom molecular dynamics simulations of ions in water with Brownian dynamics , 2015, Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences.

[6]  Benoît Roux,et al.  The Theory of Ultra-Coarse-Graining. 1. General Principles. , 2013, Journal of chemical theory and computation.

[7]  Yuqing Qiu,et al.  Coarse-Graining of TIP4P/2005, TIP4P-Ew, SPC/E, and TIP3P to Monatomic Anisotropic Water Models Using Relative Entropy Minimization. , 2014, Journal of chemical theory and computation.

[8]  Jorge Nocedal,et al.  On the Use of Stochastic Hessian Information in Optimization Methods for Machine Learning , 2011, SIAM J. Optim..

[9]  Petr Plechác,et al.  Multibody Interactions in Coarse-Graining Schemes for Extended Systems , 2008, SIAM J. Sci. Comput..

[10]  Petr Plechác,et al.  Numerical and Statistical Methods for the Coarse-Graining of Many-Particle Stochastic Systems , 2008, J. Sci. Comput..

[11]  Peter G. Kusalik,et al.  The Spatial Structure in Liquid Water , 1994, Science.

[12]  Jun S. Liu,et al.  Monte Carlo strategies in scientific computing , 2001 .

[13]  Christopher M. Bishop,et al.  Variational Relevance Vector Machines , 2000, UAI.

[14]  John D. Lafferty,et al.  Inducing Features of Random Fields , 1995, IEEE Trans. Pattern Anal. Mach. Intell..

[15]  Ilias Bilionis,et al.  Free energy computations by minimization of Kullback-Leibler divergence: An efficient adaptive biasing potential method for sparse representations , 2010, J. Comput. Phys..

[16]  David J. C. MacKay,et al.  Information Theory, Inference, and Learning Algorithms , 2004, IEEE Transactions on Information Theory.

[17]  Alexandros Sopasakis,et al.  Error Analysis of Coarse-Graining for Stochastic Lattice Dynamics , 2006, SIAM J. Numer. Anal..

[18]  Robert B. Ash,et al.  Information Theory , 2020, The SAGE International Encyclopedia of Mass Media and Society.

[19]  S. Nosé A unified formulation of the constant temperature molecular dynamics methods , 1984 .

[20]  G. Stoltz,et al.  THEORETICAL AND NUMERICAL COMPARISON OF SOME SAMPLING METHODS FOR MOLECULAR DYNAMICS , 2007 .

[21]  Anna Walsh STUDIES IN MOLECULAR DYNAMICS , 1965 .

[22]  Costas Papadimitriou,et al.  Data driven, predictive molecular dynamics for nanoscale flow simulations under uncertainty. , 2013, The journal of physical chemistry. B.

[23]  J. Loehlin Latent variable models , 1987 .

[24]  Hoover,et al.  Canonical dynamics: Equilibrium phase-space distributions. , 1985, Physical review. A, General physics.

[25]  J. Tinsley Oden,et al.  Selection, calibration, and validation of coarse-grained models of atomistic systems , 2015 .

[26]  Weber,et al.  Computer simulation of local order in condensed phases of silicon. , 1985, Physical review. B, Condensed matter.

[27]  Petr Plechác,et al.  Information-theoretic tools for parametrized coarse-graining of non-equilibrium extended systems , 2013, The Journal of chemical physics.

[28]  G. V. Chester,et al.  Solid State Physics , 2000 .

[29]  Markus J. Buehler,et al.  Atomistic Modeling of Materials Failure , 2008 .

[30]  Hiroshi Noguchi,et al.  Transport coefficients of off-lattice mesoscale-hydrodynamics simulation techniques. , 2008, Physical review. E, Statistical, nonlinear, and soft matter physics.

[31]  J. Berg,et al.  Molecular dynamics simulations of biomolecules , 2002, Nature Structural Biology.

[32]  Michael I. Jordan,et al.  Graphical Models, Exponential Families, and Variational Inference , 2008, Found. Trends Mach. Learn..

[33]  D. Rubin,et al.  Maximum likelihood from incomplete data via the EM algorithm plus discussions on the paper , 1977 .

[34]  Kurt Kremer,et al.  Simulation of Polymer Melts. II. From Coarse-Grained Models Back to Atomistic Description , 1998 .

[35]  G. Fort,et al.  Convergence of Markovian Stochastic Approximation with Discontinuous Dynamics , 2014, SIAM J. Control. Optim..

[36]  James B. Adams,et al.  Interatomic Potentials from First-Principles Calculations: The Force-Matching Method , 1993, cond-mat/9306054.

[37]  Maurizio Dapor Monte Carlo Strategies , 2020, Transport of Energetic Electrons in Solids.

[38]  Matthew J. Beal Variational algorithms for approximate Bayesian inference , 2003 .

[39]  Peng Chen,et al.  Uncertainty propagation using infinite mixture of Gaussian processes and variational Bayesian inference , 2015, J. Comput. Phys..

[40]  M Scott Shell,et al.  Coarse-graining errors and numerical optimization using a relative entropy framework. , 2011, The Journal of chemical physics.

[41]  Faming Liang,et al.  A double Metropolis–Hastings sampler for spatial models with intractable normalizing constants , 2010 .

[42]  Costas Papadimitriou,et al.  Bayesian uncertainty quantification and propagation in molecular dynamics simulations: a high performance computing framework. , 2012, The Journal of chemical physics.

[43]  M Scott Shell,et al.  Tetrahedrality and structural order for hydrophobic interactions in a coarse-grained water model. , 2014, Physical review. E, Statistical, nonlinear, and soft matter physics.

[44]  T. Straatsma,et al.  THE MISSING TERM IN EFFECTIVE PAIR POTENTIALS , 1987 .

[45]  Gregory A Voth,et al.  Multiscale coarse graining of liquid-state systems. , 2005, The Journal of chemical physics.

[46]  H. Kushner,et al.  Stochastic Approximation and Recursive Algorithms and Applications , 2003 .

[47]  Michael I. Jordan Graphical Models , 2003 .

[48]  Thomas M. Cover,et al.  Elements of Information Theory , 2005 .

[49]  George Casella,et al.  Implementations of the Monte Carlo EM Algorithm , 2001 .

[50]  Carl E. Rasmussen,et al.  The Infinite Gaussian Mixture Model , 1999, NIPS.

[51]  Kurt Kremer,et al.  Comparative atomistic and coarse-grained study of water: What do we lose by coarse-graining? , 2009, The European physical journal. E, Soft matter.

[52]  Michael I. Jordan,et al.  A Linearly-Convergent Stochastic L-BFGS Algorithm , 2015, AISTATS.

[53]  Matthew West,et al.  Bayesian factor regression models in the''large p , 2003 .

[54]  G. C. Wei,et al.  A Monte Carlo Implementation of the EM Algorithm and the Poor Man's Data Augmentation Algorithms , 1990 .

[55]  P. Moral Feynman-Kac Formulae: Genealogical and Interacting Particle Systems with Applications , 2004 .

[56]  Joseph F. Rudzinski,et al.  A generalized-Yvon-Born-Green method for coarse-grained modeling , 2015 .

[57]  Geoffrey E. Hinton,et al.  A View of the EM Algorithm that Justifies Incremental, Sparse, and other Variants , 1998, Learning in Graphical Models.

[58]  Dimitrios K. Tsagkarogiannis,et al.  From Mesoscale Back to Microscale: Reconstruction Schemes for Coarse-Grained Stochastic Lattice Systems , 2010, SIAM J. Numer. Anal..

[59]  R. Swendsen Monte Carlo Renormalization Group , 2011 .

[60]  Steve Plimpton,et al.  Fast parallel algorithms for short-range molecular dynamics , 1993 .

[61]  C. Clementi,et al.  Discovering mountain passes via torchlight: methods for the definition of reaction coordinates and pathways in complex macromolecular reactions. , 2013, Annual review of physical chemistry.

[62]  Ilias Bilionis,et al.  A stochastic optimization approach to coarse-graining using a relative-entropy framework. , 2013, The Journal of chemical physics.

[63]  Dionisios G. Vlachos,et al.  Multilevel coarse graining and nano-pattern discovery in many particle stochastic systems , 2012, J. Comput. Phys..

[64]  M. Scott Shell,et al.  The impact of resolution upon entropy and information in coarse-grained models. , 2015, The Journal of chemical physics.

[65]  John Morrissey,et al.  Data driven. , 2019, Hospitals & health networks.

[66]  Kurt Kremer,et al.  Simulation of polymer melts. I. Coarse‐graining procedure for polycarbonates , 1998 .

[67]  B. Alder,et al.  Studies in Molecular Dynamics. I. General Method , 1959 .

[68]  Abhijit Chatterjee,et al.  Spatially adaptive lattice coarse-grained Monte Carlo simulations for diffusion of interacting molecules. , 2004, The Journal of chemical physics.

[69]  Kurt Kremer,et al.  Multiscale simulation of soft matter systems – from the atomistic to the coarse-grained level and back , 2009 .

[70]  T. Lelièvre,et al.  Free Energy Computations: A Mathematical Perspective , 2010 .

[71]  Alexander Lukyanov,et al.  Versatile Object-Oriented Toolkit for Coarse-Graining Applications. , 2009, Journal of chemical theory and computation.

[72]  Katherine A. Heller,et al.  Bayesian Exponential Family PCA , 2008, NIPS.

[73]  M. Wall,et al.  Allostery in a coarse-grained model of protein dynamics. , 2005, Physical review letters.

[74]  Carsten Hartmann,et al.  Data-based parameter estimation of generalized multidimensional Langevin processes. , 2007, Physical review. E, Statistical, nonlinear, and soft matter physics.

[75]  Zhe Gan,et al.  Bridging the Gap between Stochastic Gradient MCMC and Stochastic Optimization , 2015, AISTATS.

[76]  Mário A. T. Figueiredo Adaptive Sparseness for Supervised Learning , 2003, IEEE Trans. Pattern Anal. Mach. Intell..

[77]  M Scott Shell,et al.  The relative entropy is fundamental to multiscale and inverse thermodynamic problems. , 2008, The Journal of chemical physics.

[78]  Christopher M. Bishop,et al.  Bayesian Hierarchical Mixtures of Experts , 2002, UAI.

[79]  Andrew J. Majda,et al.  Coarse-grained stochastic processes and Monte Carlo simulations in lattice systems , 2003 .

[80]  Alexey Savelyev,et al.  Molecular renormalization group coarse-graining of polymer chains: application to double-stranded DNA. , 2009, Biophysical journal.

[81]  G. Voth Coarse-Graining of Condensed Phase and Biomolecular Systems , 2008 .

[82]  Teresa Head-Gordon,et al.  The structure of ambient water , 2010 .

[83]  L. Younes On the convergence of markovian stochastic algorithms with rapidly decreasing ergodicity rates , 1999 .

[84]  Dirk Reith,et al.  Deriving effective mesoscale potentials from atomistic simulations , 2002, J. Comput. Chem..

[85]  J. Booth,et al.  Maximizing generalized linear mixed model likelihoods with an automated Monte Carlo EM algorithm , 1999 .

[86]  Guigang Zhang,et al.  Deep Learning , 2016, Int. J. Semantic Comput..

[87]  A. Lyubartsev,et al.  Calculation of effective interaction potentials from radial distribution functions: A reverse Monte Carlo approach. , 1995, Physical review. E, Statistical physics, plasmas, fluids, and related interdisciplinary topics.

[88]  H. Robbins A Stochastic Approximation Method , 1951 .

[89]  Joseph F Rudzinski,et al.  Coarse-graining entropy, forces, and structures. , 2011, The Journal of chemical physics.

[90]  W G Noid,et al.  Perspective: Coarse-grained models for biomolecular systems. , 2013, The Journal of chemical physics.

[91]  Gersende Fort,et al.  Convergence of the Monte Carlo expectation maximization for curved exponential families , 2003 .

[92]  Eric Moulines,et al.  Stability of Stochastic Approximation under Verifiable Conditions , 2005, Proceedings of the 44th IEEE Conference on Decision and Control.

[93]  Gregory A Voth,et al.  Multiscale coarse-graining and structural correlations: connections to liquid-state theory. , 2007, The journal of physical chemistry. B.

[94]  James C. Spall,et al.  Introduction to stochastic search and optimization - estimation, simulation, and control , 2003, Wiley-Interscience series in discrete mathematics and optimization.

[95]  C. Antoniak Mixtures of Dirichlet Processes with Applications to Bayesian Nonparametric Problems , 1974 .

[96]  Richard A. Levine,et al.  An automated (Markov chain) Monte Carlo EM algorithm , 2004 .

[97]  Christopher M. Bishop,et al.  Pattern Recognition and Machine Learning (Information Science and Statistics) , 2006 .

[98]  Eric Moulines,et al.  Inference in hidden Markov models , 2010, Springer series in statistics.

[99]  Christopher M. Bishop Latent Variable Models , 1998, Learning in Graphical Models.

[100]  L. Goddard Information Theory , 1962, Nature.

[101]  Nasser M. Nasrabadi,et al.  Pattern Recognition and Machine Learning , 2006, Technometrics.

[102]  Xavier Barril,et al.  Molecular modelling. , 2006, Molecular bioSystems.