Target-Specific and Selective Drug Design for COVID-19 Using Deep Generative Models

The recent COVID-19 pandemic has highlighted the need for rapid therapeutic development for infectious diseases. To accelerate this process, we present a deep learning-based generative modeling framework, CogMol, to design drug candidates specific to a given target protein sequence with high off-target selectivity. We augment this generative framework with an in silico screening process that accounts for toxicity, to lower the failure rate of the generated drug candidates in later stages of the drug development pipeline. We apply this framework to three relevant proteins of SARS-CoV-2, the virus responsible for COVID-19, namely nonstructural protein 9 (NSP9) replicase, main protease, and the receptor-binding domain (RBD) of the S protein. Docking to the target proteins demonstrates the potential of these generated molecules as ligands. Structural similarity analyses further imply novelty of the generated molecules with respect to the training dataset, as well as a possible biological association of a number of generated molecules that might be of relevance to COVID-19 therapeutic design. While the validation of these molecules is underway, we release ∼3000 novel COVID-19 drug candidates generated using our framework.
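To make the structural similarity analysis mentioned above more concrete, the following is a minimal sketch of one common way such a novelty check can be done: compute the Tanimoto similarity between fingerprint bit sets and flag a generated molecule as novel when its maximum similarity to the training set falls below a cutoff. This is an illustration under stated assumptions, not the procedure used in this work. Fingerprints are represented as plain sets of "on" bit indices, the toy fingerprints are made up, and the function names and the 0.4 threshold are hypothetical.

```python
from typing import FrozenSet, Iterable

# Assumption: a fingerprint is the set of indices of its "on" bits.
Fingerprint = FrozenSet[int]


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto (Jaccard) similarity: |a & b| / |a | b|."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def max_similarity_to_training(candidate: Fingerprint,
                               training: Iterable[Fingerprint]) -> float:
    """Highest Tanimoto similarity between a candidate and any training molecule."""
    return max((tanimoto(candidate, t) for t in training), default=0.0)


def is_novel(candidate: Fingerprint,
             training: Iterable[Fingerprint],
             threshold: float = 0.4) -> bool:
    """Hypothetical novelty rule: no training molecule is closer than `threshold`."""
    return max_similarity_to_training(candidate, training) < threshold


# Toy, made-up fingerprints for illustration only.
training_set = [frozenset({1, 2, 3, 4}), frozenset({2, 3, 5, 8})]
generated = frozenset({3, 4, 9, 10, 11})

print(max_similarity_to_training(generated, training_set))
print(is_novel(generated, training_set))
```

In practice the bit sets would come from a cheminformatics toolkit's fingerprint routine, and the cutoff would be chosen for the fingerprint type in use.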
