Generating equilibrium molecules with deep neural networks

Discovery of atomistic systems with desirable properties is a major challenge in chemistry and material science. Here we introduce a novel, autoregressive, convolutional deep neural network architecture that generates molecular equilibrium structures by sequentially placing atoms in three-dimensional space. The model estimates the joint probability over molecular configurations with tractable conditional probabilities which only depend on distances between atoms and their nuclear charges. It combines concepts from state-of-the-art atomistic neural networks with auto-regressive generative models for images and speech. We demonstrate that the architecture is capable of generating molecules close to equilibrium for constitutional isomers of C$_7$O$_2$H$_{10}$.

[1]  Olexandr Isayev,et al.  Deep reinforcement learning for de novo drug design , 2017, Science Advances.

[2]  Li Li,et al.  Bypassing the Kohn-Sham equations with machine learning , 2016, Nature Communications.

[3]  Jimmy Ba,et al.  Adam: A Method for Stochastic Optimization , 2014, ICLR.

[4]  Klaus-Robert Müller,et al.  Machine learning of accurate energy-conserving molecular force fields , 2016, Science Advances.

[5]  Steven Skiena,et al.  Syntax-Directed Variational Autoencoder for Structured Data , 2018, ICLR.

[6]  K-R Müller,et al.  SchNet - A deep learning architecture for molecules and materials. , 2017, The Journal of chemical physics.

[7]  Matt J. Kusner,et al.  Learning a Generative Model for Validity in Complex Discrete Structures , 2017, ICLR.

[8]  Jean-Louis Reymond,et al.  Enumeration of 166 Billion Organic Small Molecules in the Chemical Universe Database GDB-17 , 2012, J. Chem. Inf. Model..

[9]  David Weininger,et al.  SMILES, a chemical language and information system. 1. Introduction to methodology and encoding rules , 1988, J. Chem. Inf. Comput. Sci..

[10]  R. Kondor,et al.  Gaussian approximation potentials: the accuracy of quantum mechanics, without the electrons. , 2009, Physical review letters.

[11]  Alexandre Tkatchenko,et al.  Quantum-chemical insights from deep tensor neural networks , 2016, Nature Communications.

[12]  Pavlo O. Dral,et al.  Quantum chemistry structures and properties of 134 kilo molecules , 2014, Scientific Data.

[13]  Koray Kavukcuoglu,et al.  Pixel Recurrent Neural Networks , 2016, ICML.

[14]  Thierry Kogej,et al.  Generating Focused Molecule Libraries for Drug Discovery with Recurrent Neural Networks , 2017, ACS central science.

[15]  Heiga Zen,et al.  WaveNet: A Generative Model for Raw Audio , 2016, SSW.

[16]  Eric J. Martin,et al.  In silico generation of novel, drug-like chemical matter using the LSTM neural network , 2017, ArXiv.

[17]  Burke,et al.  Generalized Gradient Approximation Made Simple. , 1996, Physical review letters.

[18]  Samuel S. Schoenholz,et al.  Neural Message Passing for Quantum Chemistry , 2017, ICML.

[19]  J. Behler Atom-centered symmetry functions for constructing high-dimensional neural network potentials. , 2011, The Journal of chemical physics.

[20]  Alán Aspuru-Guzik,et al.  Automatic Chemical Design Using a Data-Driven Continuous Representation of Molecules , 2016, ACS central science.

[21]  Matt J. Kusner,et al.  Grammar Variational Autoencoder , 2017, ICML.

[22]  Petra Schneider,et al.  Generative Recurrent Networks for De Novo Drug Design , 2017, Molecular informatics.

[23]  Klaus-Robert Müller,et al.  SchNet: A continuous-filter convolutional neural network for modeling quantum interactions , 2017, NIPS.

[24]  Chris Morley,et al.  Open Babel: An open chemical toolbox , 2011, J. Cheminformatics.

[25]  R. Kondor,et al.  On representing chemical environments , 2012, 1209.3140.

[26]  K. Müller,et al.  Fast and accurate modeling of molecular atomization energies with machine learning. , 2011, Physical review letters.

[27]  Frank Neese,et al.  The ORCA program system , 2012 .

[28]  Kristof T. Schütt,et al.  How to represent crystal structures for machine learning: Towards fast prediction of electronic properties , 2013, 1307.1266.

[29]  F. Weigend,et al.  Balanced basis sets of split valence, triple zeta valence and quadruple zeta valence quality for H to Rn: Design and assessment of accuracy. , 2005, Physical chemistry chemical physics : PCCP.

[30]  Michele Parrinello,et al.  Generalized neural-network representation of high-dimensional potential-energy surfaces. , 2007, Physical review letters.

[31]  George E. Dahl,et al.  Prediction Errors of Molecular Machine Learning Models Lower than Hybrid DFT Error. , 2017, Journal of chemical theory and computation.

[32]  Alex Graves,et al.  Conditional Image Generation with PixelCNN Decoders , 2016, NIPS.

[33]  K. Müller,et al.  Machine Learning Predictions of Molecular Properties: Accurate Many-Body Potentials and Nonlocality in Chemical Space , 2015, The journal of physical chemistry letters.