IVS2vec: A tool of Inverse Virtual Screening based on word2vec and deep learning techniques.

Inverse virtual screening is a powerful technique in the early stage of the drug discovery process. It can provide important clues about biologically active molecules, which is useful in subsequent drug discovery research. In this work, we combine Word2vec, a natural language processing technique, with a dense fully connected neural network (DFCNN) to build a prediction model that performs binary classification. Given a query molecule, the input protein candidates are classified into two subsets: potential targets with a high probability of binding the query molecule, and proteins with a low probability of binding it. We name this model IVS2vec. IVS2vec also outputs a score reflecting the likelihood that a protein and a molecule bind, which helps improve research efficiency. We applied IVS2vec to several databases related to drug development and showed that the model can detect possible therapeutic targets. In addition, the model can identify targets related to adverse drug reactions, which is useful for improving medication safety and repurposing drugs. Moreover, IVS2vec makes predictions very quickly, so it is suitable for processing the large numbers of compounds in chemical databases. We also find that IVS2vec has the potential to outperform state-of-the-art docking tools such as AutoDock Vina: in the reverse target search case of quercetin, IVS2vec produced more convincing results than AutoDock Vina.
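The classification step described above, in which a per-protein binding score is used to divide the candidates into likely and unlikely targets, can be illustrated with a minimal sketch. This is not the IVS2vec implementation: the function name `split_targets`, the 0.5 threshold, and the example scores are all hypothetical, and the scores are assumed to come from some already trained model.

```python
from typing import Dict, List, Tuple

Scored = Tuple[str, float]


def split_targets(
    scores: Dict[str, float], threshold: float = 0.5
) -> Tuple[List[Scored], List[Scored]]:
    """Split protein candidates into likely and unlikely binders.

    `scores` maps a protein identifier to a binding score in [0, 1].
    Both the score source and the threshold are assumptions made for
    illustration; they are not taken from the paper.
    """
    # Rank candidates from highest to lowest score.
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    likely = [(name, s) for name, s in ranked if s >= threshold]
    unlikely = [(name, s) for name, s in ranked if s < threshold]
    return likely, unlikely


# Hypothetical scores for one query molecule against three proteins.
example_scores = {"protein_A": 0.91, "protein_B": 0.12, "protein_C": 0.67}
likely, unlikely = split_targets(example_scores)
```

Keeping the ranked score alongside each label preserves the ordering, so the most probable targets in the high-score subset can be examined first.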
