Discrimination of approved drugs from experimental drugs by learning methods

Background

Assessing whether a compound is drug-like as early as possible is critical in the drug discovery process. Many efforts have been made to create sets of 'rules' or 'filters' intended to help chemists distinguish 'drug-like' molecules from 'non-drug' molecules. However, within the chemical space of drug-like molecules, only a minority will become approved drugs. Distinguishing approved drugs from experimental drugs may therefore be more helpful for finding future approved drugs. In this paper, approved drugs are discriminated from experimental ones by analyzing the compounds in terms of the features of existing drugs and by applying machine learning methods.

Results

Four methodologies were compared by their performance in classifying approved drugs versus experimental ones. The best single-method results were obtained with a support vector machine (SVM), with an accuracy of 0.7911, a sensitivity of 0.5929, and a specificity of 0.8743. Based on these results, a consensus model was developed that raised the correct classification rate to 0.8517, the sensitivity to 0.7242, and the specificity to 0.9352. The methods were further tested by applying them to the Traditional Chinese Medicine Ingredients Database (TCM-ID). These results indicate that the model is a potent tool for identifying drug molecules.

Conclusion

This work has potential applications in combinatorial library design and virtual high-throughput screening for drug discovery.
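As an illustration of the reported metrics, the following minimal Python sketch computes accuracy, sensitivity, and specificity from binary labels (1 = approved, 0 = experimental) and combines several classifiers into a consensus prediction. The abstract does not state how the consensus model was built, so the strict majority vote used here is an assumption, and the labels, the model names `model_a`, `model_b`, `model_c`, and the helper functions are hypothetical toy examples rather than the authors' data or code.

```python
from typing import Dict, List, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, sensitivity, and specificity for binary labels (1 = approved)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # approved, predicted approved
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # experimental, predicted experimental
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # experimental, predicted approved
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # approved, predicted experimental
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }


def majority_vote(predictions: Sequence[Sequence[int]]) -> List[int]:
    """Consensus label per compound: 1 only if a strict majority of models say 1.

    Assumption: ties (possible with an even number of models) resolve to 0.
    """
    n_models = len(predictions)
    return [1 if 2 * sum(votes) > n_models else 0 for votes in zip(*predictions)]


# Hypothetical toy data: 3 approved and 5 experimental compounds.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
model_a = [1, 0, 1, 0, 0, 1, 0, 0]
model_b = [1, 1, 0, 0, 0, 0, 0, 1]
model_c = [0, 1, 1, 0, 1, 0, 0, 0]

single = classification_metrics(y_true, model_a)
consensus = majority_vote([model_a, model_b, model_c])
combined = classification_metrics(y_true, consensus)

print("model_a:", single)
print("consensus:", combined)
```

In this toy case each individual model makes errors on different compounds, so the vote corrects them. That is the general motivation for a consensus model, although a real gain depends on how correlated the individual classifiers' errors are.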
