Machine learning force fields and coarse-grained variables in molecular dynamics: application to materials and biological systems.

Machine learning encompasses a set of tools and algorithms that are becoming popular in almost all scientific and technological fields. Molecular dynamics is no exception: machine learning promises to extract valuable information from the enormous amounts of data generated by simulations of complex systems. We review here the current understanding of the goals, benefits, and limitations of machine learning techniques for computational studies of atomistic systems, focusing on the construction of empirical force fields from ab initio databases and on the determination of reaction coordinates for free energy computation and enhanced sampling.
