Generative Network Complex for the Automated Generation of Drug-like Molecules

Current drug discovery is expensive and time-consuming. It remains challenging to create a wide variety of novel compounds that have desirable pharmacological properties and are affordable for low-income people. In this work, we develop a generative network complex (GNC) that generates new drug-like molecules through multi-property optimization, carried out by gradient descent in the latent space of an autoencoder. In our GNC, multiple chemical properties and similarity scores are optimized together to generate drug-like molecules with desired chemical properties and to predict those properties. To further validate the reliability of the predictions, the generated molecules are reevaluated and screened by independent 2D fingerprint-based predictors, yielding a few hundred new drug candidates. As a demonstration, we apply our GNC to generate a large number of new BACE1 inhibitors, as well as thousands of novel alternative drug candidates for eight existing market drugs: Ceritinib, Ribociclib, Acalabrutinib, Idelalisib, Dabrafenib, Macimorelin, Enzalutamide, and Panobinostat.
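The core idea of multi-property optimization by gradient descent in a latent space can be illustrated with a minimal sketch. It is not the GNC implementation: the linear predictors `W @ z + b` are hypothetical stand-ins for trained property predictors, the squared distance to a reference latent vector `z_ref` is an assumed proxy for a similarity score, and the autoencoder's encoding and decoding steps are omitted. All function and parameter names are illustrative.

```python
import numpy as np


def loss_and_grad(z, W, b, targets, weights, z_ref, sim_weight):
    """Toy multi-property loss on a latent vector z, with its gradient.

    loss(z) = sum_i weights[i] * (W[i] @ z + b[i] - targets[i])**2
              + sim_weight * ||z - z_ref||**2

    ASSUMPTION: linear predictors and a latent-space distance replace the
    trained predictors and similarity score of a real generative model.
    """
    residual = W @ z + b - targets      # predicted minus target, per property
    diff = z - z_ref                    # offset from the reference latent vector
    loss = float(np.sum(weights * residual**2) + sim_weight * np.sum(diff**2))
    grad = 2.0 * W.T @ (weights * residual) + 2.0 * sim_weight * diff
    return loss, grad


def optimize_latent(z0, W, b, targets, weights, z_ref,
                    sim_weight=0.1, lr=0.1, steps=500):
    """Plain gradient descent on the latent vector, starting from z0."""
    z = np.asarray(z0, dtype=float).copy()
    for _ in range(steps):
        _, grad = loss_and_grad(z, W, b, targets, weights, z_ref, sim_weight)
        z -= lr * grad
    return z


# Hypothetical setup: a 4-dimensional latent space and two properties.
W = np.array([[1.0, 0.0, 0.5, 0.0],
              [0.0, 1.0, 0.0, -0.5]])
b = np.zeros(2)
targets = np.array([1.0, -2.0])               # desired property values
weights = np.array([1.0, 1.0])                # relative importance per property
z_ref = np.array([0.5, -0.5, 0.0, 0.0])       # latent code of a reference molecule

z_opt = optimize_latent(np.zeros(4), W, b, targets, weights, z_ref)
print("predicted properties at optimum:", W @ z_opt + b)
```

In a full pipeline, the optimized latent vector would be decoded back into a molecule and then rescreened by independent predictors, as the abstract describes. The `sim_weight` term controls how far the optimized vector may drift from the reference.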
