Large-scale ligand-based virtual screening for SARS-CoV-2 inhibitors using deep neural networks

The ongoing severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) pandemic has created an urgent need for novel therapies and drugs. We conducted a large-scale virtual screening for small molecules that are potential SARS-CoV-2 inhibitors. To this end, we used ChemAI, a deep neural network trained on more than 220M data points across 3.6M molecules from three public drug-discovery databases. With ChemAI, we screened one billion molecules from the ZINC database and ranked them by predicted favourable effects against SARS-CoV-2. We then reduced the result to the 30,000 top-ranked compounds, which are readily accessible and purchasable via the ZINC database. We provide these top-ranked compounds as a library for further screening with bioassays at https://github.com/ml-jku/sars-cov-inhibitors-chemai.
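
The screen-rank-reduce step described above can be illustrated with a minimal sketch. It assumes molecules arrive as a stream of SMILES strings and that some scoring function returns a higher value for more promising compounds. The names `top_k_ranked` and `toy_score` are hypothetical, and the toy score (string length) is only a placeholder, not ChemAI or any real activity predictor. The sketch keeps the best k molecules in a single pass, so the full library never has to be held in memory.

```python
import heapq
from typing import Callable, Iterable, List, Tuple


def top_k_ranked(
    smiles_stream: Iterable[str],
    score_fn: Callable[[str], float],
    k: int,
) -> List[Tuple[float, str]]:
    """Return the k highest-scoring molecules, best first.

    heapq.nlargest consumes the stream once and keeps only k
    candidates at a time, which suits very large libraries.
    """
    scored = ((score_fn(smiles), smiles) for smiles in smiles_stream)
    return heapq.nlargest(k, scored)


def toy_score(smiles: str) -> float:
    """Hypothetical placeholder score: NOT a real activity prediction."""
    return float(len(smiles))


if __name__ == "__main__":
    # Tiny illustrative library; a real screen would stream from a file.
    library = ["CCO", "c1ccccc1", "CC(=O)O", "C"]
    for score, smiles in top_k_ranked(library, toy_score, k=2):
        print(f"{score:.1f}\t{smiles}")
```

In a real setting, `score_fn` would wrap a trained model's prediction, and k would be set to the desired library size (30,000 in the screening described here).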
