Learning to Extend Molecular Scaffolds with Structural Motifs

Recent advances in deep learning-based modeling of molecules promise to accelerate in silico drug discovery. Many generative models are available, building molecules either atom-by-atom and bond-by-bond or fragment-by-fragment. However, many drug discovery projects require a fixed scaffold to be present in the generated molecule, and incorporating that constraint has only recently been explored. Here, we propose MoLeR, a graph-based model that naturally supports scaffolds as the initial seed of the generative procedure, which is possible because it is not conditioned on the generation history. Our experiments show that MoLeR performs comparably to state-of-the-art methods on unconstrained molecular optimization tasks and outperforms them on scaffold-based tasks, while being an order of magnitude faster to train and sample from than existing approaches. Furthermore, we show the influence of a number of seemingly minor design choices on the overall performance.
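To illustrate why independence from the generation history makes scaffold seeding straightforward, the following is a minimal toy sketch and not the MoLeR implementation. The names `PartialGraph`, `extend`, and `generate` are hypothetical, the learned decoder is replaced by a random choice, and chemical validity (for example valence) is ignored. The only point is that each step reads nothing but the current partial graph, so generation can start from an empty graph or from a scaffold alike.

```python
import random
from dataclasses import dataclass, field


@dataclass
class PartialGraph:
    """Toy partial molecule: atom symbols plus undirected bonds (i, j) with i < j."""
    atoms: list = field(default_factory=list)
    bonds: set = field(default_factory=set)


def extend(graph, rng):
    """One generation step (hypothetical stand-in for a learned decoder).

    The step looks only at the current partial graph, never at the
    sequence of steps that produced it.
    """
    graph.atoms.append(rng.choice(["C", "N", "O"]))
    new_index = len(graph.atoms) - 1
    if new_index > 0:
        # Attach the new atom to a randomly chosen existing atom.
        neighbor = rng.randrange(new_index)
        graph.bonds.add((neighbor, new_index))


def generate(seed=None, max_atoms=12, rng=None):
    """Grow a graph up to max_atoms, starting from `seed` if one is given."""
    rng = rng or random.Random(0)
    # Copy the seed so the caller's scaffold is left unchanged.
    graph = PartialGraph(list(seed.atoms), set(seed.bonds)) if seed else PartialGraph()
    while len(graph.atoms) < max_atoms:
        extend(graph, rng)
    return graph


# A six-membered carbon ring used as the scaffold (toy example).
scaffold = PartialGraph(
    atoms=["C"] * 6,
    bonds={(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (0, 5)},
)
molecule = generate(seed=scaffold, max_atoms=10, rng=random.Random(1))
```

Because `extend` never removes atoms or bonds, the scaffold is guaranteed to be a subgraph of the result. A model whose steps depended on the generation history would first need a history that ends at the scaffold, which is the difficulty the abstract alludes to.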
