3DLinker: An E(3) Equivariant Variational Autoencoder for Molecular Linker Design

Deep learning has achieved tremendous success in designing novel chemical compounds with desirable pharmaceutical properties. In this work, we focus on a new type of drug design problem: generating a small "linker" to physically attach two independent molecules with distinct functions. The main computational challenges are: 1) linker generation is conditional on the two given molecules, in contrast to previous works that generate full molecules from scratch; 2) linkers depend heavily on the anchor atoms of the two molecules to be connected, which are not known beforehand; 3) the 3D structures and orientations of the molecules must be considered to avoid atom clashes, which requires equivariance to the E(3) group. To address these problems, we propose a conditional generative model, named 3DLinker, which predicts anchor atoms and jointly generates linker graphs and their 3D structures using an E(3) equivariant graph variational autoencoder. To the best of our knowledge, no previous model achieves this task. We compare our model with multiple conditional generative models adapted from other molecular design tasks and find that it recovers molecular graphs at a significantly higher rate and, more importantly, accurately predicts the 3D coordinates of all atoms.
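To make the E(3) equivariance requirement concrete, the following is a minimal sketch of what the property means for a coordinate-predicting layer: rotating, reflecting, or translating the input atoms should transform the predicted coordinates in exactly the same way. The update rule below is a generic illustration in the spirit of equivariant graph networks, not the 3DLinker architecture; the function name `equivariant_update`, the scalar features `h`, and the step size `w` are all assumptions introduced for this example.

```python
import numpy as np

def equivariant_update(x, h, w=0.1):
    """One illustrative equivariant coordinate update (NOT the 3DLinker layer).

    x: (n, 3) atom coordinates; h: (n,) scalar, transformation-invariant features.
    Each atom moves along its relative vectors to the other atoms, scaled by a
    weight that depends only on invariant quantities (features and distances).
    """
    diff = x[:, None, :] - x[None, :, :]                    # (n, n, 3) relative vectors
    dist2 = np.sum(diff ** 2, axis=-1)                      # (n, n) squared distances
    weight = w * np.tanh(h[:, None] * h[None, :] - dist2)   # (n, n) invariant scalars
    return x + np.sum(weight[..., None] * diff, axis=1)     # (n, 3) updated coordinates

def random_orthogonal(rng):
    """Random 3x3 orthogonal matrix (a rotation or a reflection) via QR."""
    q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    return q

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 3))   # hypothetical coordinates of 5 atoms
h = rng.normal(size=5)        # hypothetical scalar atom features
Q = random_orthogonal(rng)    # orthogonal part of an E(3) transformation
t = rng.normal(size=3)        # translation part

# Equivariance: transforming the input first must equal transforming the output.
lhs = equivariant_update(x @ Q.T + t, h)
rhs = equivariant_update(x, h) @ Q.T + t
equivariance_error = float(np.max(np.abs(lhs - rhs)))
print(equivariance_error)     # expected to be at floating-point rounding level
```

The check works because the relative vectors rotate with the input while distances and scalar features stay unchanged, so the update commutes with any rigid motion or reflection. A model built from such layers does not need to learn each orientation of the two input molecules separately.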
