Structure-Based de Novo Molecular Generator Combined with Artificial Intelligence and Docking Simulations

Molecular generation models based on deep learning have recently attracted significant attention in drug discovery. However, most existing models have a serious limitation for drug design: they do not sufficiently account for the three-dimensional (3D) structure of the target protein during generation. In this study, we developed a new deep learning-based molecular generator, SBMolGen, that integrates a recurrent neural network, a Monte Carlo tree search, and docking simulations. In an evaluation on four target proteins (two kinases and two G protein-coupled receptors), the generated molecules had better binding affinity scores (docking scores) than the known active compounds and covered a broader region of chemical space. SBMolGen not only generates novel binding-active molecules but also provides their 3D docking poses with the target protein, which will be useful in subsequent drug design. The code is available at https://github.com/clinfo/SBMolGen.
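
To make the coupling of the three components concrete, the following is a minimal sketch of a Monte Carlo tree search that grows a string token by token, completes it with a sampling policy, and scores the result. It is an illustration under stated assumptions, not the SBMolGen implementation: toy_policy stands in for the recurrent neural network, toy_docking_score stands in for a docking simulation, and the four-token alphabet is not real SMILES. All names in the block are hypothetical.

```python
import math
import random

TOKENS = ["C", "N", "O", "$"]  # "$" ends a string; toy alphabet, not SMILES
MAX_LEN = 8                    # assumed cap on string length


def toy_policy(prefix):
    """Stand-in for the RNN: uniform next-token probabilities."""
    return {t: 1.0 / len(TOKENS) for t in TOKENS}


def toy_docking_score(molecule):
    """Stand-in for a docking run; lower is better, as for docking scores."""
    return -float(molecule.count("N")) - 0.5 * molecule.count("O")


class Node:
    def __init__(self, prefix, parent=None):
        self.prefix = prefix
        self.parent = parent
        self.children = {}
        self.visits = 0
        self.total_reward = 0.0

    def is_terminal(self):
        return self.prefix.endswith("$") or len(self.prefix) >= MAX_LEN

    def ucb_child(self, c=1.0):
        """Pick the child with the highest upper confidence bound."""
        def ucb(child):
            if child.visits == 0:
                return float("inf")
            mean = child.total_reward / child.visits
            return mean + c * math.sqrt(math.log(self.visits) / child.visits)
        return max(self.children.values(), key=ucb)


def rollout(prefix, rng):
    """Complete a partial string by sampling tokens from the policy."""
    while not prefix.endswith("$") and len(prefix) < MAX_LEN:
        probs = toy_policy(prefix)
        prefix += rng.choices(list(probs), weights=list(probs.values()))[0]
    return prefix.rstrip("$")


def search(n_iterations=200, seed=0):
    """Return the (score, string) pair with the lowest score seen."""
    rng = random.Random(seed)
    root = Node("")
    best = (float("inf"), "")
    for _ in range(n_iterations):
        node = root
        # Selection: descend while the node is fully expanded.
        while not node.is_terminal() and len(node.children) == len(TOKENS):
            node = node.ucb_child()
        # Expansion: add one untried token as a new child.
        if not node.is_terminal():
            token = rng.choice([t for t in TOKENS if t not in node.children])
            node.children[token] = Node(node.prefix + token, parent=node)
            node = node.children[token]
        # Simulation: finish the string and score it.
        molecule = rollout(node.prefix, rng)
        score = toy_docking_score(molecule)
        best = min(best, (score, molecule))
        reward = -score / MAX_LEN  # toy scores lie in [-MAX_LEN, 0]
        # Backpropagation: update statistics up to the root.
        while node is not None:
            node.visits += 1
            node.total_reward += reward
            node = node.parent
    return best


if __name__ == "__main__":
    print(search())
```

In a structure-based setting, the scoring step would be replaced by generating a 3D conformer and docking it against the target protein, which is the expensive part of each iteration; the tree statistics then steer later rollouts toward prefixes whose completions docked well.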

[38]  C. Grebner,et al.  Automated De-Novo Design in Medicinal Chemistry: Which Types of Chemistry Does a Generative Neural Network Learn? , 2020, Journal of medicinal chemistry.