Drug repositioning of herbal compounds via a machine-learning approach

Background: Drug repositioning, also known as drug repurposing, identifies new indications for existing drugs and can be used as an alternative to de novo drug development. In recent years, the accumulation of large volumes of information related to drugs and diseases has led to the development of various computational approaches for drug repositioning. Although herbal medicines have had a great impact on current drug discovery, a large number of herbal compounds still have no definite indications.

Results: In the present study, we constructed a computational model to predict the unknown pharmacological effects of herbal compounds using machine learning techniques. Based on the assumption that similar diseases can be treated with similar drugs, we used four categories of drug-drug similarity (chemical structure, side-effects, gene ontology, and targets) and three categories of disease-disease similarity (phenotypes, human phenotype ontology, and gene ontology). Associations between drugs and diseases were then predicted from these similarity features. The prediction models were constructed using three classification algorithms: logistic regression, random forest, and support vector machine. Upon cross-validation, the random forest approach showed the best performance (AUC = 0.948) and also performed well in an external validation assessment using an unseen independent dataset (AUC = 0.828). Finally, the constructed model was applied to predict potential indications for existing drugs and herbal compounds. As a result, new indications for 20 existing drugs and 31 herbal compounds were predicted and validated using clinical trial data.

Conclusions: The predicted results were validated manually, confirming the performance and underlying mechanisms; one example is irinotecan as a treatment for neuroblastoma. Based on the predictions, herbal compounds were identified as drug candidates for related diseases and merit further development. The proposed prediction model can contribute to drug discovery by suggesting drug candidates from herbal compounds that have potential but have been little studied.
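The abstract does not say how the similarity categories are turned into classifier features, so the following is only a minimal sketch under stated assumptions. It assumes one common construction, used by earlier similarity-based methods such as PREDICT: for a candidate drug-disease pair and each (drug similarity, disease similarity) combination, the feature is the maximum over known associations of the geometric mean of the two similarities. All identifiers (`drug_sims`, `disease_sims`, `known`, `pair_features`, `auc`) and all numeric values are hypothetical toy data, and only two drug categories and one disease category are shown for brevity. The `auc` helper computes the rank-based area under the ROC curve, the metric reported above.

```python
from itertools import product
from math import sqrt

# Hypothetical toy similarity tables, one per similarity category.
drug_sims = {
    "chemical": {("d1", "d2"): 0.9, ("d1", "d3"): 0.2, ("d2", "d3"): 0.3},
    "target": {("d1", "d2"): 0.8, ("d1", "d3"): 0.1, ("d2", "d3"): 0.4},
}
disease_sims = {
    "phenotype": {("s1", "s2"): 0.7, ("s1", "s3"): 0.1, ("s2", "s3"): 0.2},
}
# Hypothetical known drug-disease associations (positive examples).
known = {("d1", "s1"), ("d3", "s3")}


def sim(table, a, b):
    """Symmetric lookup; identical items have similarity 1, missing pairs 0."""
    if a == b:
        return 1.0
    return table.get((a, b), table.get((b, a), 0.0))


def pair_features(drug, disease, known_pairs):
    """One feature per (drug category, disease category) combination.

    Assumed scoring rule: max over known associations (d', s') of
    sqrt(sim_drug(d, d') * sim_disease(s, s')).
    """
    feats = []
    for dcat, scat in product(sorted(drug_sims), sorted(disease_sims)):
        best = 0.0
        for kd, ks in known_pairs:
            if (kd, ks) == (drug, disease):
                continue  # skip the pair itself to avoid label leakage
            score = sqrt(sim(drug_sims[dcat], drug, kd)
                         * sim(disease_sims[scat], disease, ks))
            best = max(best, score)
        feats.append(best)
    return feats


def auc(labels, scores):
    """Area under the ROC curve as the fraction of correctly ranked
    positive/negative pairs (ties count as half)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


features = pair_features("d2", "s2", known)
```

In a fuller pipeline, feature vectors like `features` would be computed for known associations (positives) and sampled unknown pairs (negatives), passed to a classifier such as a random forest, and scored with `auc` under cross-validation.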
