Learning molecular dynamics with simple language model built upon long short-term memory neural network

Recurrent neural networks have led to breakthroughs in natural language processing and speech recognition. Here we show that recurrent networks, specifically long short-term memory networks, can also capture the temporal evolution of chemical/biophysical trajectories. Our character-level language model learns a probabilistic model of 1-dimensional stochastic trajectories generated from higher-dimensional dynamics. The model captures Boltzmann statistics and also reproduces kinetics across a spectrum of timescales. We demonstrate that training the long short-term memory network is equivalent to learning a path entropy, and that its embedding layer, instead of representing the contextual meaning of characters, here exhibits a nontrivial connectivity between different metastable states in the underlying physical system. We demonstrate our model’s reliability on several benchmark systems and on a force spectroscopy trajectory for a multi-state riboswitch. We anticipate that our work represents a stepping stone toward understanding and using recurrent neural networks to study the dynamics of complex stochastic molecular systems.
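To make the character-level framing concrete, the following is a minimal sketch of how a 1-dimensional trajectory might be turned into a sequence of discrete "characters" and then into next-character training pairs for a recurrent model. It is an illustration under stated assumptions, not the procedure used in this work: the helper names `discretize` and `next_token_pairs`, the choice of 20 uniform bins, the window length of 100, and the random-walk toy trajectory are all hypothetical.

```python
import numpy as np


def discretize(traj, n_bins):
    """Map a 1-D trajectory onto integer labels 0..n_bins-1 (the 'characters').

    Uniform binning between the trajectory's min and max is an assumption
    made for this sketch.
    """
    edges = np.linspace(traj.min(), traj.max(), n_bins + 1)
    # Use interior edges only, so the maximum value lands in the last bin.
    return np.digitize(traj, edges[1:-1])


def next_token_pairs(tokens, seq_len):
    """Build sliding windows whose targets are the inputs shifted by one step."""
    n = len(tokens) - seq_len
    inputs = np.stack([tokens[i:i + seq_len] for i in range(n)])
    targets = np.stack([tokens[i + 1:i + seq_len + 1] for i in range(n)])
    return inputs, targets


# Toy stand-in for a stochastic trajectory (a random walk, not real data).
rng = np.random.default_rng(0)
traj = np.cumsum(rng.normal(size=1000))

tokens = discretize(traj, n_bins=20)
inputs, targets = next_token_pairs(tokens, seq_len=100)
```

Each row of `inputs` could then be fed to an embedding layer followed by a long short-term memory layer, with the matching row of `targets` serving as the labels for a cross-entropy loss over the next character.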
