Optimizing drug–target interaction prediction based on random walk on heterogeneous networks

BackgroundPredicting novel drug–target associations is important not only for developing new drugs, but also for furthering biological knowledge by understanding how drugs work and their modes of action. As more data about drugs, targets, and their interactions becomes available, computational approaches have become an indispensible part of drug target association discovery. In this paper we apply random walk with restart (RWR) method to a heterogeneous network of drugs and targets compiled from DrugBank database and investigate the performance of the method under parameter variation and choice of chemical fingerprint methods.ResultsWe show that choice of chemical fingerprint does not affect the performance of the method when the parameters are tuned to optimal values. We use a subset of the ChEMBL15 dataset that contains 2,763 associations between 544 drugs and 467 target proteins to evaluate our method, and we extracted datasets of bioactivity ≤1 and ≤10 μM activity cutoff. For 1 μM bioactivity cutoff, we find that our method can correctly predict nearly 47, 55, 60% of the given drug–target interactions in the test dataset having more than 0, 1, 2 drug target relations for ChEMBL 1 μM dataset in top 50 rank positions. For 10 μM bioactivity cutoff, we find that our method can correctly predict nearly 32.4, 34.8, 35.3% of the given drug–target interactions in the test dataset having more than 0, 1, 2 drug target relations for ChEMBL 1 μM dataset in top 50 rank positions. We further examine the associations between 110 popular top selling drugs in 2012 and 3,519 targets and find the top ten targets for each drug.ConclusionsWe demonstrate the effectiveness and promise of the approach—RWR on heterogeneous networks using chemical features—for identifying novel drug target interactions and investigate the performance.

[1]  Yanli Wang,et al.  PubChem: a public information system for analyzing bioactivities of small molecules , 2009, Nucleic Acids Res..

[2]  Xing Chen,et al.  Drug-target interaction prediction by random walk on the heterogeneous network. , 2012, Molecular bioSystems.

[3]  James Hendler,et al.  Google’s PageRank and Beyond: The Science of Search Engine Rankings , 2007 .

[4]  H. L. Morgan The Generation of a Unique Machine Description for Chemical Structures-A Technique Developed at Chemical Abstracts Service. , 1965 .

[5]  David Rogers,et al.  Extended-Connectivity Fingerprints , 2010, J. Chem. Inf. Model..

[6]  Sergey Brin,et al.  The Anatomy of a Large-Scale Hypertextual Web Search Engine , 1998, Comput. Networks.

[7]  Yoshihiro Yamanishi,et al.  Supervised prediction of drug–target interactions using bipartite local models , 2009, Bioinform..

[8]  B. Roth,et al.  The Multiplicity of Serotonin Receptors: Uselessly Diverse Molecules or an Embarrassment of Riches? , 2000 .

[9]  Dharmaranjan Sriram,et al.  Enhanced ranking of PknB Inhibitors using data fusion methods , 2013, Journal of Cheminformatics.

[10]  P. Robinson,et al.  Walking the interactome for prioritization of candidate disease genes. , 2008, American journal of human genetics.

[11]  Michael S Lajiness,et al.  Systems chemical biology and the Semantic Web: what they mean for the future of drug discovery research. , 2012, Drug discovery today.

[12]  Amy Nicole Langville,et al.  Google's PageRank and beyond - the science of search engine rankings , 2006 .

[13]  Christopher I. Bayly,et al.  Evaluating Virtual Screening Methods: Good and Bad Metrics for the "Early Recognition" Problem , 2007, J. Chem. Inf. Model..

[14]  David S. Wishart,et al.  DrugBank: a comprehensive resource for in silico drug discovery and exploration , 2005, Nucleic Acids Res..

[15]  James G. Nourse,et al.  Reoptimization of MDL Keys for Use in Drug Discovery , 2002, J. Chem. Inf. Comput. Sci..

[16]  Bin Chen,et al.  Chem2Bio2RDF: a semantic framework for linking and data mining chemogenomic and systems chemical biology data , 2010, BMC Bioinformatics.

[17]  Bin Chen,et al.  Assessing Drug Target Association Using Semantic Linked Data , 2012, PLoS Comput. Biol..

[18]  David S. Wishart,et al.  DrugBank 3.0: a comprehensive resource for ‘Omics’ research on drugs , 2010, Nucleic Acids Res..

[19]  John P. Overington,et al.  ChEMBL: a large-scale bioactivity database for drug discovery , 2011, Nucleic Acids Res..