Network predicting drug's anatomical therapeutic chemical code

MOTIVATION Discovering the rules that govern drugs' Anatomical Therapeutic Chemical (ATC) classification at the molecular level is vital for understanding how the vast majority of drugs act. However, few studies have attempted to annotate a drug's potential ATC-codes by computational approaches.
RESULTS Here, we introduce the drug-target network to computationally predict a drug's ATC-codes and propose a novel method named NetPredATC. Starting from the assumption that drugs with similar chemical structures or target proteins share common ATC-codes, NetPredATC assigns a drug's potential ATC-codes by integrating chemical structures and target proteins. Specifically, we first construct a gold-standard positive dataset from drug ATC-code annotation databases. We then characterize ATC-codes and drugs by their similarity profiles and define a kernel function to correlate them. Finally, we use a kernel method, the support vector machine, to predict a drug's ATC-codes automatically. The method was validated on four drug datasets with different classes of target proteins: enzymes, ion channels, G-protein-coupled receptors and nuclear receptors. We found that both a drug's chemical structure and its target proteins are predictive, and that target protein information gives better accuracy. Integrating these two data sources revealed more experimentally validated ATC-codes for drugs. We extensively compared NetPredATC with SuperPred, a method based on chemical similarity only. Experimental results showed that NetPredATC outperforms SuperPred in both predictive coverage and accuracy. In addition, database searches and functional annotation analysis indicate that our novel predictions are worthy of future experimental validation.
CONCLUSION NetPredATC predicts a drug's ATC-codes more accurately by incorporating the drug-target network and integrating data, which will promote the understanding of drug mechanisms as well as drug repositioning and discovery.
AVAILABILITY NetPredATC is available at http://doc.aporc.org/wiki/NetPredATC.
CONTACT ycwang@nwipb.cas.cn or ywang@amss.ac.cn
SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
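
The abstract does not spell out the kernel that correlates drugs with ATC-codes, so the following is only a minimal sketch of one plausible construction, not the authors' implementation. It assumes (1) that chemical and target similarities are integrated by a weighted average and (2) that the kernel between two (drug, ATC-code) pairs is the product of the drug-drug and code-code similarities. The function names `combine` and `pair_kernel`, the `weight` parameter, and all similarity values are hypothetical and chosen purely for illustration.

```python
from itertools import product


def combine(sim_a, sim_b, weight=0.5):
    """Weighted average of two drug-drug similarity matrices.

    ASSUMPTION: the abstract only says the two sources are integrated;
    a convex combination is one simple way to do that.
    """
    n = len(sim_a)
    return [
        [weight * sim_a[i][j] + (1 - weight) * sim_b[i][j] for j in range(n)]
        for i in range(n)
    ]


def pair_kernel(drug_sim, atc_sim, pairs):
    """Product kernel over (drug, ATC-code) pairs.

    ASSUMPTION: K((d, a), (d', a')) = S_drug(d, d') * S_atc(a, a').
    """
    return [
        [drug_sim[d1][d2] * atc_sim[a1][a2] for (d2, a2) in pairs]
        for (d1, a1) in pairs
    ]


# Toy similarity matrices (made-up values): 3 drugs, 2 ATC-codes.
chem_sim = [[1.0, 0.8, 0.1], [0.8, 1.0, 0.2], [0.1, 0.2, 1.0]]
target_sim = [[1.0, 0.6, 0.0], [0.6, 1.0, 0.1], [0.0, 0.1, 1.0]]
atc_sim = [[1.0, 0.3], [0.3, 1.0]]

drug_sim = combine(chem_sim, target_sim, weight=0.5)

# Every (drug index, ATC-code index) pair is one sample for the classifier.
pairs = list(product(range(3), range(2)))
K = pair_kernel(drug_sim, atc_sim, pairs)
```

A matrix such as `K` could then be passed to any support vector machine implementation that accepts a precomputed kernel, with known drug/ATC-code annotations as positive labels. How negative examples are chosen and how the similarities themselves are computed are not stated in the abstract and are left open here.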
