Predictive Activity Profiling of Drugs by Topological-Fragment-Spectra-Based Support Vector Machines

To predict the pleiotropic effects of drugs, we investigated the multilabel classification of drugs that carry one or more of 100 different activity labels. The structural features of each drug molecule were represented by the topological fragment spectra (TFS) method proposed in our previous work. Support vector machines (SVMs) were used to classify the drugs and to predict their activity classes, with multilabel classification carried out by a set of SVM classifiers. The collective SVM classifiers were trained on a training set of 59,180 compounds and validated on a separate validation set of 29,590 compounds. On a test set of 9,864 compounds, the classifiers correctly assigned 80.8% of the drugs to their own active classes. The SVM classifiers also successfully predicted the activity spectra of multilabel compounds.
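As an illustration of how a set of SVM classifiers can produce multilabel predictions, the following minimal sketch trains one binary linear SVM per activity label and reports every label whose classifier gives a positive score. It is not the implementation used in this work: the one-classifier-per-label decomposition, the linear kernel, the sub-gradient training loop, the hyperparameters, the three-element feature vectors (toy stand-ins for TFS descriptors), and the label names "A", "B", and "C" are all assumptions made for the example.

```python
# Illustrative sketch only: one binary linear SVM per activity label.
# Feature vectors, label names, and hyperparameters are hypothetical.

def train_linear_svm(X, y, lam=0.01, eta=0.1, epochs=200):
    """Fit a bias-free linear SVM by sub-gradient descent on the
    L2-regularized hinge loss. Targets in y must be +1 or -1."""
    w = [0.0] * len(X[0])
    for _ in range(epochs):
        for x, target in zip(X, y):
            margin = target * sum(wj * xj for wj, xj in zip(w, x))
            # Regularization step: shrink the weights slightly.
            w = [(1.0 - eta * lam) * wj for wj in w]
            # Hinge-loss step: only samples inside the margin contribute.
            if margin < 1.0:
                w = [wj + eta * target * xj for wj, xj in zip(w, x)]
    return w

def train_label_classifiers(X, label_sets, labels):
    """Train one SVM per label; a compound is a positive example for
    a label if that label is in its label set."""
    models = {}
    for label in labels:
        y = [1 if label in s else -1 for s in label_sets]
        models[label] = train_linear_svm(X, y)
    return models

def predict_labels(models, x):
    """Return every label whose classifier scores x above zero."""
    return {
        label
        for label, w in models.items()
        if sum(wj * xj for wj, xj in zip(w, x)) > 0.0
    }

# Toy training data: each row is a hypothetical descriptor vector.
LABELS = ["A", "B", "C"]
X_train = [[1, 0, 0], [0, 1, 0], [1, 1, 0], [0, 0, 1]]
Y_train = [{"A"}, {"B"}, {"A", "B"}, {"C"}]

models = train_label_classifiers(X_train, Y_train, LABELS)
print(sorted(predict_labels(models, [1, 1, 0])))
```

Because each classifier decides independently, a compound can receive several labels at once, which is the behavior needed for predicting an activity spectrum rather than a single class.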
