DecompDiff: Diffusion Models with Decomposed Priors for Structure-Based Drug Design

Designing 3D ligands within a target binding site is a fundamental task in drug discovery. Existing structure-based drug design methods treat all ligand atoms equally, which ignores the different roles that atoms play in the ligand and can make exploration of the large drug-like molecule space less efficient. In this paper, inspired by the convention in pharmaceutical practice, we decompose the ligand molecule into two parts, namely arms and scaffold, and propose a new diffusion model, DecompDiff, with decomposed priors over arms and scaffold. To facilitate the decomposed generation and improve the properties of the generated molecules, we incorporate both bond diffusion in the model and additional validity guidance in the sampling phase. Extensive experiments on CrossDocked2020 show that our approach achieves state-of-the-art performance in generating high-affinity molecules while maintaining proper molecular properties and conformational stability, with up to −8.39 Avg. Vina Dock score and 24.5% Success Rate. The code is provided at https://github.com/bytedance/DecompDiff
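As a rough illustration of what "decomposed priors over arms and scaffold" could mean in practice, the sketch below draws initial atom positions part by part rather than from a single shared prior. It is not taken from the DecompDiff code: the isotropic Gaussian form, the `PartPrior` class, the `sample_decomposed_prior` function, and all numeric values are assumptions made only for this example.

```python
import random
from dataclasses import dataclass


@dataclass
class PartPrior:
    """Hypothetical isotropic Gaussian prior for one ligand part (assumption)."""
    name: str        # e.g. "arm_0" or "scaffold"
    center: tuple    # (x, y, z) prior mean; units are illustrative
    std: float       # isotropic standard deviation
    num_atoms: int   # number of atoms assigned to this part


def sample_decomposed_prior(priors, seed=0):
    """Sample initial atom positions from each part's own prior."""
    rng = random.Random(seed)
    atoms = []
    for prior in priors:
        for _ in range(prior.num_atoms):
            # Each coordinate is drawn around the part's own center.
            pos = tuple(rng.gauss(c, prior.std) for c in prior.center)
            atoms.append((prior.name, pos))
    return atoms


# Illustrative values only: two arms and one scaffold between them.
priors = [
    PartPrior("arm_0", (0.0, 0.0, 0.0), 1.0, 4),
    PartPrior("arm_1", (8.0, 0.0, 0.0), 1.0, 3),
    PartPrior("scaffold", (4.0, 0.0, 0.0), 2.0, 6),
]
atoms = sample_decomposed_prior(priors)
```

In a diffusion sampler, positions like these would serve as the starting noise that the denoising network then refines; the abstract does not specify how the actual priors are constructed, so the repository linked above is the place to check.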
