MolGrow: A Graph Normalizing Flow for Hierarchical Molecular Generation

We propose a hierarchical normalizing flow model for generating molecular graphs. The model produces new molecular structures from a single-node graph by recursively splitting every node into two. All operations are invertible and can be used as plug-and-play modules. The hierarchical nature of the latent codes allows for precise changes in the resulting graph: perturbations in the top layer cause global structural changes, while perturbations in subsequent layers change the resulting molecule only marginally. The proposed model outperforms existing generative graph models on the distribution learning task. We also report successful experiments on global and constrained optimization of chemical properties using the latent codes of the model.

Introduction

Drug discovery is a challenging multidisciplinary task that combines domain knowledge in chemistry, biology, and computational science. Recent works have demonstrated successful applications of machine learning to the drug development process, including synthesis planning (Segler, Preuss, and Waller 2018), protein folding (Senior et al. 2020), and hit discovery (Merk et al. 2018; Zhavoronkov et al. 2019). Advances in generative models have enabled applications of machine learning to drug discovery, such as distribution learning and molecular property optimization. Distribution learning models train on a large dataset to produce novel compounds (Polykovskiy et al. 2020); property optimization models search the chemical space for molecules with desirable properties (Brown et al. 2019). Researchers often combine these tasks: they first train a distribution learning model and then use its latent codes to optimize molecular properties (Gómez-Bombarelli et al. 2018). For such models, proper latent codes are crucial for navigating the molecular space.

We propose a new graph generative model, MolGrow. Starting with a single node, it iteratively splits every node into two.
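Since every node is split into two at each step, the graph size doubles per level, so a molecule is generated through a fixed ladder of power-of-two sizes. A minimal sketch of that size schedule (an illustration of the doubling scheme, not the authors' code; the function name is ours):

```python
def grow_levels(num_atoms: int) -> list[int]:
    """Graph sizes visited when growing from a single node by
    splitting every node into two at each level. Sizes double,
    so the target size is the next power of two >= num_atoms."""
    target = 1 << max(0, (num_atoms - 1).bit_length())
    sizes, n = [], 1
    while n <= target:
        sizes.append(n)
        n *= 2
    return sizes

print(grow_levels(19))  # [1, 2, 4, 8, 16, 32]: a 19-atom molecule is padded to 32 nodes
```

This also shows why the latent manifold has a fixed size: a molecule with up to 2^L atoms always passes through exactly L + 1 levels.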
Our model is invertible and maps molecular structures onto a fixed-size hierarchical manifold. Top levels of the manifold define the global structure, while the bottom levels influence local features. Our contributions are three-fold:

• We propose a hierarchical normalizing flow model for generating molecular graphs. The model gradually increases the graph size during sampling, starting with a single node;
• We propose a fragment-oriented atom ordering that improves our model over the commonly used breadth-first search ordering;
• We apply our model to distribution learning and property optimization tasks. We report distribution learning metrics (Fréchet ChemNet distance and fragment distribution) for graph generative models, in addition to the standard uniqueness and validity measures.

Copyright © 2021, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.

Background: Normalizing Flows

Normalizing flows are generative models that transform a prior distribution p(z) into a target distribution p(x) by composing invertible functions f_k:

z = f_K ∘ … ∘ f_2 ∘ f_1(x),  (1)
x = f_1^{-1} ∘ … ∘ f_{K-1}^{-1} ∘ f_K^{-1}(z).  (2)

We call Equation 1 the forward path and Equation 2 the inverse path. The prior distribution p(z) is often a standard multivariate normal distribution N(0, I). Such models are trained by maximizing the training set log-likelihood using the change of variables formula:

log p(x) = log p(z) + Σ_{i=1}^{K} log |det(dh_i / dh_{i-1})|,  (3)

where h_i = f_i(h_{i-1}) and h_0 = x. To train the model and sample from it efficiently, the inverse transformations and Jacobian determinants should be tractable and computationally cheap. In this work, we consider three types of layers: an invertible linear layer, actnorm, and the real-valued non-volume preserving transformation (RealNVP) (Dinh, Sohl-Dickstein, and Bengio 2017). We define these layers below for arbitrary d-dimensional vectors, and extend them to graph-structured data in the next section.
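The change-of-variables training above hinges on layers whose inverse and log-determinant are cheap. A RealNVP affine coupling layer achieves this by transforming one half of the vector conditioned on the other, so its Jacobian is triangular. A minimal NumPy sketch (the linear maps Ws and Wt are toy stand-ins for the learned scale and shift networks, not the paper's architecture):

```python
import numpy as np

rng = np.random.default_rng(0)
d = 6  # toy dimensionality; the vector is split into two halves

# Hypothetical toy parameters standing in for learned networks s(.) and t(.)
Ws = rng.normal(scale=0.1, size=(d // 2, d // 2))
Wt = rng.normal(scale=0.1, size=(d // 2, d // 2))

def coupling_forward(x):
    x1, x2 = x[: d // 2], x[d // 2 :]
    s, t = np.tanh(Ws @ x1), Wt @ x1            # scale and shift from the first half
    z = np.concatenate([x1, x2 * np.exp(s) + t])
    log_det = s.sum()                            # triangular Jacobian: log|det| = sum(s)
    return z, log_det

def coupling_inverse(z):
    z1, z2 = z[: d // 2], z[d // 2 :]
    s, t = np.tanh(Ws @ z1), Wt @ z1             # recompute s, t from the untouched half
    return np.concatenate([z1, (z2 - t) * np.exp(-s)])

x = rng.normal(size=d)
z, log_det = coupling_forward(x)
assert np.allclose(coupling_inverse(z), x)       # exact invertibility, no matrix inversion
```

Both directions cost one pass through the conditioning networks, and the log-determinant in Equation 3 is a simple sum, which is exactly the tractability the text asks for.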
We consider the invertible linear layer parameterization of Hoogeboom, Van Den Berg, and Welling (2019), which uses a QR decomposition of the weight matrix: h = QR · z, where Q is an orthogonal matrix (Q^{-1} = Q^T) and R is an upper triangular matrix.

The Thirty-Fifth AAAI Conference on Artificial Intelligence (AAAI-21)
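The appeal of the QR parameterization is that both the inverse and the log-determinant avoid general matrix inversion: Q is inverted by transposition, R by triangular solve, and |det(QR)| is the product of R's diagonal. A small NumPy sketch of this idea (here we obtain a valid Q, R pair by decomposing a random matrix, rather than learning them as in the cited parameterization):

```python
import numpy as np

rng = np.random.default_rng(1)
d = 4

# A valid orthogonal/upper-triangular pair from QR-decomposing a random matrix
Q, R = np.linalg.qr(rng.normal(size=(d, d)))

def linear_forward(z):
    h = Q @ R @ z
    # det(QR) = det(Q) det(R) = ±prod(diag(R)), so the log|det| is a cheap sum
    log_det = np.log(np.abs(np.diag(R))).sum()
    return h, log_det

def linear_inverse(h):
    # Q^{-1} = Q^T; R is upper triangular, so a solve suffices (no explicit inverse)
    return np.linalg.solve(R, Q.T @ h)

z = rng.normal(size=d)
h, log_det = linear_forward(z)
assert np.allclose(linear_inverse(h), z)
```

In a trained layer R's diagonal would be kept nonzero (e.g. strictly positive) so the map stays invertible and the log-determinant finite.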

References

[1] Nicola De Cao et al. MolGAN: An implicit generative model for small molecular graphs. arXiv, 2018.

[2] Alán Aspuru-Guzik et al. Deep learning enables rapid identification of potent DDR1 kinase inhibitors. Nature Biotechnology, 2019.

[3] Stephen Dunn. Smiles. 1932.

[4] Nikos Komodakis et al. GraphVAE: Towards Generation of Small Graphs Using Variational Autoencoders. ICANN, 2018.

[5] Weinan Zhang et al. GraphAF: a Flow-based Autoregressive Model for Molecular Graph Generation. ICLR, 2020.

[6] Prafulla Dhariwal et al. Glow: Generative Flow with Invertible 1x1 Convolutions. NeurIPS, 2018.

[7] Marwin H. S. Segler et al. GuacaMol: Benchmarking Models for De Novo Molecular Design. J. Chem. Inf. Model., 2018.

[8] Nikos Komodakis et al. Dynamic Edge-Conditioned Filters in Convolutional Neural Networks on Graphs. CVPR, 2017.

[9] Dmitry Vetrov et al. Deterministic Decoding for Discrete Data in Variational Autoencoders. AISTATS, 2020.

[10] Thierry Kogej et al. Generating Focused Molecule Libraries for Drug Discovery with Recurrent Neural Networks. ACS Central Science, 2017.

[11] Mike Preuss et al. Planning chemical syntheses with deep neural networks and symbolic AI. Nature, 2017.

[12] Fei Wang et al. MoFlow: An Invertible Flow Model for Generating Molecular Graphs. KDD, 2020.

[13] David Rogers et al. Extended-Connectivity Fingerprints. J. Chem. Inf. Model., 2010.

[14] Matt J. Kusner et al. Grammar Variational Autoencoder. ICML, 2017.

[15] Alán Aspuru-Guzik et al. SELFIES: a robust representation of semantically constrained graphs with an example application in chemistry. arXiv, 2019.

[16] Regina Barzilay et al. Junction Tree Variational Autoencoder for Molecular Graph Generation. ICML, 2018.

[17] Wei Lu et al. Attention Guided Graph Convolutional Networks for Relation Extraction. ACL, 2019.

[18] Samy Bengio et al. Density estimation using Real NVP. ICLR, 2016.

[19] Andrey Kazennov et al. The cornucopia of meaningful leads: Applying deep adversarial autoencoders for new molecule development in oncology. Oncotarget, 2016.

[20] Jure Leskovec et al. GraphRNN: Generating Realistic Graphs with Deep Auto-regressive Models. ICML, 2018.

[21] Alán Aspuru-Guzik et al. Automatic Chemical Design Using a Data-Driven Continuous Representation of Molecules. ACS Central Science, 2016.

[22] Lukasz Kaiser et al. Attention is All you Need. NIPS, 2017.

[23] Pietro Liò et al. Graph Attention Networks. ICLR, 2017.

[24] Esben Jannik Bjerrum et al. SMILES Enumeration as Data Augmentation for Neural Network Modeling of Molecules. arXiv, 2017.

[25] Alán Aspuru-Guzik et al. Molecular Sets (MOSES): A Benchmarking Platform for Molecular Generation Models. Frontiers in Pharmacology, 2018.

[26] Matthias Rarey et al. On the Art of Compiling and Using 'Drug-Like' Chemical Fragment Spaces. ChemMedChem, 2008.

[27] Demis Hassabis et al. Improved protein structure prediction using potentials from deep learning. Nature, 2020.

[28] Steven Skiena et al. Syntax-Directed Variational Autoencoder for Structured Data. ICLR, 2018.

[29] Pavlo O. Dral et al. Quantum chemistry structures and properties of 134 kilo molecules. Scientific Data, 2014.

[30] Max Welling et al. Emerging Convolutions for Generative Normalizing Flows. ICML, 2019.

[31] Noel M. O'Boyle et al. DeepSMILES: An Adaptation of SMILES for Use in Machine-Learning of Chemical Structures. 2018.

[32] Sepp Hochreiter et al. Fréchet ChemNet Distance: A Metric for Generative Models for Molecules in Drug Discovery. J. Chem. Inf. Model., 2018.

[33] Jure Leskovec et al. Graph Convolutional Policy Network for Goal-Directed Molecular Graph Generation. NeurIPS, 2018.

[34] Kumar Krishna Agrawal et al. Discrete Flows: Invertible Generative Models of Discrete Data. DGS@ICLR, 2019.

[35] Regina Barzilay et al. Hierarchical Generation of Molecular Graphs using Structural Motifs. ICML, 2020.

[36] Samy Bengio et al. Order Matters: Sequence to sequence for sets. ICLR, 2015.

[37] Motoki Abe et al. GraphNVP: An Invertible Flow Model for Generating Molecular Graphs. arXiv, 2019.

[38] Razvan Pascanu et al. Stabilizing Transformers for Reinforcement Learning. ICML, 2019.

[39] Alán Aspuru-Guzik et al. Optimizing distributions over molecular space. An Objective-Reinforced Generative Adversarial Network for Inverse-design Chemistry (ORGANIC). 2017.

[40] Gisbert Schneider et al. De Novo Design of Bioactive Small Molecules by Artificial Intelligence. Molecular Informatics, 2018.

[41] Yuval Tassa et al. Continuous control with deep reinforcement learning. ICLR, 2015.

[42] David Weininger et al. SMILES. 2. Algorithm for generation of unique SMILES notation. J. Chem. Inf. Comput. Sci., 1989.

[43] G. Bemis et al. The properties of known drugs. 1. Molecular frameworks. Journal of Medicinal Chemistry, 1996.

[44] Olexandr Isayev et al. MolecularRNN: Generating realistic molecular graphs with optimized properties. arXiv, 2019.

[45] Tom White et al. Sampling Generative Networks: Notes on a Few Effective Techniques. arXiv, 2016.