Deep Learning Based Drug Screening for Novel Coronavirus 2019-nCoV

A novel coronavirus, 2019-nCoV, was recently identified in Wuhan, Hubei Province, China, and is now spreading across China and other parts of the world. Although some drugs are being used to treat 2019-nCoV infection, there is no sound scientific evidence of their activity against the virus. Developing a drug that combats the virus effectively is therefore of high significance for saving lives. Drug development by traditional methods usually takes a long time. Because 2019-nCoV is highly homologous to SARS-CoV, alternative methods such as deep learning offer a faster route to candidate drugs for this disease. In the present work, we first collected viral RNA sequences of 18 patients reported to have 2019-nCoV from a public-domain database, translated the RNA into protein sequences, and performed multiple sequence alignment. After a careful literature survey and sequence analysis, we selected 3C-like protease as a major therapeutic target and built a 3D model of it by homology modeling. Using this structural model, we ran a large-scale virtual screening pipeline built on a deep learning based method, recently developed in our group, that accurately ranks and identifies protein–ligand interacting pairs. The model identified potential drugs for 2019-nCoV 3C-like protease by screening four chemical compound databases (ChemDiv, Targetmol-Approved_Drug_Library, Targetmol-Natural_Compound_Library, and Targetmol-Bioactive_Compound_Library) and a database of tripeptides. We provide a list of candidate chemical ligands (meglumine, vidarabine, adenosine, D-sorbitol, D-mannitol, sodium gluconate, ganciclovir and chlorobutanol) and peptide drugs (combinations of isoleucine, lysine and proline) from these databases to guide experimental scientists, so that molecules that can combat the virus can be validated in a shorter time.
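Two steps of the workflow described above, translating RNA into protein sequence and assembling a tripeptide library, can be illustrated with a minimal Python sketch. This is not the pipeline used in this work: the function names `translate_rna` and `enumerate_tripeptides` are hypothetical, the example assumes the standard genetic code and the 20 canonical amino acids, and the size and composition of the tripeptide database actually screened are not stated here.

```python
from itertools import product

# Standard genetic code, laid out with bases ordered U, C, A, G at each
# codon position (NCBI translation table 1). '*' marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS_BY_CODON = (
    "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS_BY_CODON[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

# One-letter codes of the 20 canonical amino acids (an assumption about
# which residues a tripeptide library would contain).
CANONICAL_RESIDUES = "ACDEFGHIKLMNPQRSTVWY"


def translate_rna(rna: str) -> str:
    """Translate an RNA string in reading frame 0 until the first stop codon.

    DNA-style 'T' is accepted and treated as 'U'. Codons containing
    ambiguous bases are translated as 'X'. Trailing bases that do not
    form a full codon are ignored.
    """
    rna = rna.upper().replace("T", "U")
    protein = []
    for start in range(0, len(rna) - 2, 3):
        residue = CODON_TABLE.get(rna[start:start + 3], "X")
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


def enumerate_tripeptides(residues: str = CANONICAL_RESIDUES) -> list[str]:
    """Return every ordered tripeptide over the given residue alphabet."""
    return ["".join(combo) for combo in product(residues, repeat=3)]


if __name__ == "__main__":
    print(translate_rna("AUGGCUUAA"))
    library = enumerate_tripeptides()
    print(len(library))
    # Tripeptides made only of isoleucine (I), lysine (K) and proline (P).
    ikp_only = [p for p in library if set(p) <= set("IKP")]
    print(len(ikp_only))
```

In practice, each enumerated tripeptide would still need a 3D conformation and a docking or scoring step against the 3C-like protease model before it could be ranked; those steps depend on external tools and are left out of this sketch.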
