Improving chemical similarity ensemble approach in target prediction

Background: In silico target prediction of compounds plays an important role in drug discovery. The chemical similarity ensemble approach (SEA) is a promising method that has been successfully applied in many drug-related studies. Because SEA can be built on different types of molecular fingerprints, a variety of SEA-like models are available. To investigate the influence of training data selection and the complementarity of different models, several SEA models were constructed and tested.

Results: On a test set of 37,138 positive and 42,928 negative ligand-target interactions, at a significance level of 0.05, the Topological-based model yielded the best precision (83.7 %) and $F_{0.25}$-measure (0.784) among the five molecular fingerprint methods tested, while the Atom pair-based model yielded the best $F_{0.5}$-measure (0.694). Combining the five models through an election system gave a flexible prediction scheme, with precision ranging from 71 to 90.6 %, $F_{0.5}$-measure from 0.663 to 0.684, and $F_{0.25}$-measure from 0.696 to 0.817.

Conclusions: The overall effectiveness of the five models can be ranked in decreasing order as follows: Atom pair ≈ Topological > Morgan > MACCS > Pharmacophore. Combining multiple SEA models, which takes advantage of the strengths of each, could improve prediction success rates. Using target-specific classes or more active compounds may improve the models further.
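
The two evaluation ideas in the abstract can be illustrated with a minimal Python sketch. The F-measure here is the standard definition, $F_\beta = (1+\beta^2)PR / (\beta^2 P + R)$, where values of $\beta$ below 1 (such as 0.5 and 0.25) weight precision more heavily than recall. The election rule is an assumption, since the abstract does not spell it out: each model is taken to vote for an interaction when its p-value falls below the significance level, and the number of votes required (`min_votes`) is a hypothetical tunable parameter. The function names and the example p-values are illustrative only and are not taken from the study.

```python
def f_beta(precision, recall, beta):
    """Standard F-beta score; beta < 1 favours precision over recall."""
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def elect(p_values, alpha=0.05, min_votes=3):
    """Assumed voting rule: predict an interaction when at least
    `min_votes` models report a p-value below `alpha`.

    Returns (prediction, number_of_votes).
    """
    votes = sum(1 for p in p_values.values() if p < alpha)
    return votes >= min_votes, votes


# Hypothetical p-values for one ligand-target pair (not real data).
example = {
    "atom_pair": 0.01,
    "topological": 0.03,
    "morgan": 0.20,
    "maccs": 0.04,
    "pharmacophore": 0.60,
}

print(elect(example, min_votes=3))
print(elect(example, min_votes=4))
print(round(f_beta(1.0, 0.5, 0.5), 4))
```

Under this assumed rule, raising `min_votes` makes the combined model stricter, which is one plausible way an election scheme could trade recall for precision across a range like the one reported.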
