A Perspective on Deep Learning for Molecular Modeling and Simulations

Deep learning is transforming many areas of science, and it holds great potential for modeling molecular systems. However, unlike its mature deployment in computer vision and natural language processing, the development of deep learning in molecular modeling and simulations is still at an early stage, largely because the inductive biases of molecules are fundamentally different from those of images or text. Building on these differences, we first review the limitations of traditional deep learning models from the perspective of molecular physics and summarize recent technical advances at the interface between molecular modeling and deep learning. Rather than focusing merely on ever more complex neural network architectures, we introduce various useful concepts and ideas brought by modern deep learning. We hope that translating these ideas into molecular modeling will create new opportunities. To this end, we survey several representative applications, ranging from supervised to unsupervised and reinforcement learning, and discuss their connections to emerging trends in deep learning. Finally, we offer an outlook on promising directions that may help address the open issues in the current framework of deep molecular modeling.
