MOST: most-similar ligand based approach to target prediction

Background: Many computational approaches have been used for target prediction, including machine learning, reverse docking, bioactivity spectra analysis, and chemical similarity searching. Recent studies have suggested that chemical similarity searching may be driven by the most-similar ligand. However, these studies have oversimplified or even neglected the extent of bioactivity of the most-similar ligands, which has impaired their predictive power.

Results: Here we propose the MOst-Similar ligand-based Target inference approach, MOST, which uses fingerprint similarity and the explicit bioactivity of the most-similar ligands to predict targets of a query compound. The performance of MOST was evaluated using combinations of different fingerprint schemes, machine learning methods, and bioactivity representations. In sevenfold cross-validation on a benchmark Ki dataset from ChEMBL release 19, containing 61,937 bioactivity data points for 173 human targets, MOST achieved high average prediction accuracy (0.95 for pKi ≥ 5, and 0.87 for pKi ≥ 6). The Morgan fingerprint performed slightly better than FP2, and the Logistic Regression and Random Forest methods performed better than Naïve Bayes. In a temporal validation, the Ki dataset from ChEMBL19 was used to train models that then predicted the bioactivity of ligands newly deposited in ChEMBL20. MOST again performed well (accuracy of 0.90 for pKi ≥ 5, and 0.76 for pKi ≥ 6) when Logistic Regression and the Morgan fingerprint were employed. Furthermore, the p values associated with explicit bioactivity were found to be a robust index for removing false positive predictions; implicit bioactivity did not offer this capability. Finally, p values generated with Logistic Regression, the Morgan fingerprint, and explicit bioactivity were integrated with a false discovery rate (FDR) control procedure to reduce false positives in the multiple-target prediction scenario, and the success of this strategy was demonstrated with the case of fluanisone. In the case of aloe-emodin’s laxative effect, MOST predicted that acetylcholinesterase was the mechanism-of-action target; in vivo studies validated this prediction.

Conclusions: The MOST approach can yield highly accurate and robust target predictions. Integrated with an FDR control procedure, MOST provides a reliable framework for multiple-target inference. It has prospective applications in drug repurposing and mechanism-of-action target prediction.
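
The two ideas the abstract combines, a most-similar ligand lookup by fingerprint similarity and FDR control over per-target p values, can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes fingerprints are represented as sets of "on" bit indices (a toy stand-in for Morgan or FP2 fingerprints), that similarity is the Tanimoto coefficient, and that the FDR procedure is the Benjamini-Hochberg step-up rule. The function names and the example data are hypothetical, and the p values are taken as given inputs; how MOST derives them from the Logistic Regression model is not shown here.

```python
from typing import Dict, FrozenSet, List, Tuple

# Assumption: a fingerprint is the set of its "on" bit indices.
Fingerprint = FrozenSet[int]


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto coefficient: shared bits divided by bits set in either."""
    if not a and not b:
        return 0.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def most_similar_ligand(
    query: Fingerprint, ligands: List[Tuple[Fingerprint, float]]
) -> Tuple[float, float]:
    """Return (similarity, pKi) of the known ligand closest to the query.

    `ligands` holds (fingerprint, pKi) pairs for a single target.
    """
    return max(
        ((tanimoto(query, fp), pki) for fp, pki in ligands),
        key=lambda pair: pair[0],
    )


def benjamini_hochberg(pvalues: Dict[str, float], alpha: float = 0.05) -> List[str]:
    """Return the targets kept at FDR level `alpha` (step-up procedure)."""
    ranked = sorted(pvalues.items(), key=lambda item: item[1])
    m = len(ranked)
    cutoff = 0
    for rank, (_, p) in enumerate(ranked, start=1):
        # Keep the largest rank whose p value is under its threshold.
        if p <= rank * alpha / m:
            cutoff = rank
    return [name for name, _ in ranked[:cutoff]]


# Hypothetical data, for illustration only.
query = frozenset({1, 2, 3, 4})
target_ligands = [(frozenset({1, 2, 3, 5}), 7.2), (frozenset({1, 9}), 5.1)]
similarity, pki = most_similar_ligand(query, target_ligands)

target_pvalues = {"T1": 0.001, "T2": 0.02, "T3": 0.04, "T4": 0.5}
kept = benjamini_hochberg(target_pvalues, alpha=0.05)
```

In this sketch the similarity and pKi of the nearest ligand are the two quantities a per-target classifier would consume, and the Benjamini-Hochberg step filters the resulting list of candidate targets when many targets are tested at once.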
