A Perspective on Deep Learning for Molecular Modeling and Simulations

Deep learning is transforming many areas of science, and it holds great potential for modeling molecular systems. However, unlike its mature deployment in computer vision and natural language processing, its development in molecular modeling and simulation is still at an early stage, largely because the inductive biases of molecules differ fundamentally from those of images or text. Building on these differences, we first review the limitations of traditional deep learning models from the perspective of molecular physics and summarize recent technical advances at the interface between molecular modeling and deep learning. Rather than focusing merely on ever more complex neural network architectures, we introduce a variety of useful concepts and ideas brought by modern deep learning. We hope that translating these ideas into molecular modeling will create new opportunities. To this end, we survey several representative applications, ranging from supervised to unsupervised and reinforcement learning, and discuss their connections with emerging trends in deep learning. Finally, we outline promising directions that may help address existing issues in the current framework of deep molecular modeling.
