In silico prediction of spleen tyrosine kinase inhibitors using machine learning approaches and an optimized molecular descriptor subset generated by recursive feature elimination method

We tested four machine learning methods, support vector machine (SVM), k-nearest neighbor, back-propagation neural network, and C4.5 decision tree, for their ability to predict spleen tyrosine kinase (Syk) inhibitors, using a set of 2592 compounds that is more diverse than those used in other studies. The recursive feature elimination method was used to improve prediction performance and to select the molecular descriptors responsible for distinguishing Syk inhibitors from non-inhibitors. Among the four models, SVM gave the best performance, 99.18% for inhibitors and 98.82% for non-inhibitors, indicating that SVM is potentially useful for facilitating the discovery of Syk inhibitors.
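
To make the descriptor-selection step concrete, the following is a minimal sketch of recursive feature elimination around a linear SVM. It is not the authors' implementation: the toy data, the bias-free subgradient trainer, the hyperparameters (`lam`, `lr`, `epochs`), and the function names are all assumptions made for illustration, and the per-class figures are computed on the training data only.

```python
# Sketch of SVM-based recursive feature elimination (RFE) on toy data.
# All names, values, and hyperparameters here are hypothetical.

def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Fit a bias-free linear SVM by batch subgradient descent on the hinge loss."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(epochs):
        grad = [lam * wj for wj in w]  # gradient of the L2 penalty
        for xi, yi in zip(X, y):
            margin = yi * sum(wj * xj for wj, xj in zip(w, xi))
            if margin < 1:  # sample violates the margin
                for j in range(d):
                    grad[j] -= yi * xi[j] / n
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w

def rfe(X, y, n_keep):
    """Repeatedly retrain and drop the descriptor with the smallest squared weight."""
    kept = list(range(len(X[0])))
    eliminated = []
    while len(kept) > n_keep:
        X_sub = [[row[j] for j in kept] for row in X]
        w = train_linear_svm(X_sub, y)
        worst = min(range(len(kept)), key=lambda k: w[k] ** 2)
        eliminated.append(kept.pop(worst))
    return kept, eliminated

def per_class_accuracy(X, y, w):
    """Return the fraction of correct predictions for class +1 and for class -1."""
    hits = {1: 0, -1: 0}
    totals = {1: 0, -1: 0}
    for xi, yi in zip(X, y):
        pred = 1 if sum(wj * xj for wj, xj in zip(w, xi)) > 0 else -1
        totals[yi] += 1
        hits[yi] += int(pred == yi)
    return hits[1] / totals[1], hits[-1] / totals[-1]

# Toy descriptors: columns 0 and 2 flip sign with the class label,
# columns 1 and 3 are identical in both classes (no class information).
X = [
    [2, 1, 1, 3], [-2, 1, -1, 3],
    [3, -2, 2, 0], [-3, -2, -2, 0],
    [1, 0, 3, -1], [-1, 0, -3, -1],
    [2, 3, 2, 2], [-2, 3, -2, 2],
]
y = [1, -1, 1, -1, 1, -1, 1, -1]  # +1 = inhibitor, -1 = non-inhibitor

kept, eliminated = rfe(X, y, n_keep=2)
X_kept = [[row[j] for j in kept] for row in X]
w_final = train_linear_svm(X_kept, y)
acc_inhibitors, acc_non_inhibitors = per_class_accuracy(X_kept, y, w_final)
print("kept descriptors:", kept, "elimination order:", eliminated)
print("accuracy (inhibitors, non-inhibitors):", acc_inhibitors, acc_non_inhibitors)
```

On this constructed set the two class-independent columns receive zero weight and are removed first, leaving columns 0 and 2. A real study would rank descriptors with a tuned SVM and report per-class accuracy on held-out compounds rather than on the training set.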
