A Generative Model for Molecular Distance Geometry

Great computational effort is invested in generating equilibrium states for molecular systems using, for example, Markov chain Monte Carlo. We present a probabilistic model that generates statistically independent samples for molecules from their graph representations. Our model learns a low-dimensional manifold that preserves the geometry of local atomic neighborhoods through a principled learning representation that is based on Euclidean distance geometry. In a new benchmark for molecular conformation generation, we show experimentally that our generative model achieves state-of-the-art accuracy. Finally, we show how to use our model as a proposal distribution in an importance sampling scheme to compute molecular properties.
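As a minimal illustration of the importance sampling idea mentioned above, the sketch below computes a self-normalized estimate of an expected property from samples drawn from a proposal distribution. It is not the paper's implementation. The function name `importance_estimate` and its arguments are hypothetical, and it assumes that the unnormalized log-density of the target (for instance, a negative energy divided by kT) and the log-density of the proposal are available for every sample.

```python
import math


def importance_estimate(values, log_target_unnorm, log_proposal):
    """Self-normalized importance sampling estimate of E_p[f(x)].

    values            -- f(x_i) for samples x_i drawn from the proposal q
    log_target_unnorm -- log of the unnormalized target density at each x_i
    log_proposal      -- log q(x_i) at each x_i
    (All three names are hypothetical; they are not taken from the paper.)
    """
    # Log importance weights: log p~(x_i) - log q(x_i).
    log_w = [lt - lq for lt, lq in zip(log_target_unnorm, log_proposal)]
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(log_w)
    w = [math.exp(lw - m) for lw in log_w]
    total = sum(w)
    # Weighted average; the unknown normalizing constant cancels out.
    return sum(wi * v for wi, v in zip(w, values)) / total
```

When the proposal already matches the target, all weights are equal and the estimate reduces to the plain sample mean. The further the proposal is from the target, the more the estimate is dominated by a few heavily weighted samples.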
