Identification of vasodilators from molecular descriptors by machine learning methods

Abstract Vasodilators have been extensively used in the treatment of various vascular diseases. With the aim of developing accurate computational models for identifying vasodilators of diverse structures, several machine learning methods, such as the C4.5 decision tree (C4.5 DT), k-nearest neighbor (k-NN), and support vector machine (SVM), were explored in this work. The identification models were trained on a group of 635 compounds (308 vasodilators and 327 non-vasodilators) described by 198 three-dimensional molecular descriptors, and feature selection was conducted to optimize the models and to select the most appropriate descriptors for identifying vasodilators. An independent validation set of 74 vasodilators and 87 non-vasodilators was subsequently used to evaluate the performance of the developed models. On this set, the identification rates of the models range from 78.38% to 97.30% for vasodilators and from 83.91% to 86.21% for non-vasodilators. Our investigation reveals that the explored machine learning methods, especially SVM, are potentially useful for the identification of vasodilators.
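The identification rates quoted above are per-class accuracies on the validation set: the percentage of vasodilators (or non-vasodilators) that a model labels correctly. The short sketch below illustrates that arithmetic. The function name and the counts of correctly identified compounds are assumptions: the abstract reports only percentages, and the counts shown are back-calculated as the only whole numbers out of 74 and 87 that round to the reported values.

```python
def identification_rate(correct: int, total: int) -> float:
    """Percentage of compounds in one class that are labeled correctly."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * correct / total


# Validation-set sizes stated in the abstract.
N_VASODILATORS = 74
N_NON_VASODILATORS = 87

# Assumed counts, inferred from the reported percentages (not given in the source).
inferred_correct = {
    "vasodilators, lowest rate": (58, N_VASODILATORS),
    "vasodilators, highest rate": (72, N_VASODILATORS),
    "non-vasodilators, lowest rate": (73, N_NON_VASODILATORS),
    "non-vasodilators, highest rate": (75, N_NON_VASODILATORS),
}

for label, (correct, total) in inferred_correct.items():
    rate = identification_rate(correct, total)
    print(f"{label}: {correct}/{total} = {rate:.2f}%")
```

Read this way, the spread among the models is considerably wider for vasodilators (a difference of 14 compounds out of 74) than for non-vasodilators (2 compounds out of 87).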
