ksrMKL: a novel method for identification of kinase–substrate relationships using multiple kernel learning

Phosphorylation, which is catalyzed by protein kinases, plays a crucial role in many cellular processes and is closely related to many diseases. Identifying kinase–substrate relationships is important for understanding phosphorylation and provides a fundamental basis for further disease-related research and drug design. In this study, we develop ksrMKL, a novel computational method that identifies kinase–substrate relationships using multiple kernel learning. We first compare ksrMKL with a single-kernel support vector machine by 10-fold cross-validation on a dataset collected from the Phospho.ELM database; ksrMKL shows a large improvement on various performance measures. We then compare ksrMKL with two existing kinase–substrate relationship prediction tools, iGPS and PKIS, on an independent test dataset extracted from the PhosphoSitePlus database. The experimental results show that ksrMKL achieves better prediction performance than both tools.
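
To illustrate the general idea behind multiple kernel learning, the sketch below combines two base kernel matrices into one by a non-negative weighted sum. It is a minimal illustration only: the abstract does not state which kernels, features, or weight-learning procedure ksrMKL uses, so the linear and RBF kernels, the toy feature vectors, the fixed weights, and all function names here are assumptions made for the example.

```python
import math

def linear_kernel(x, y):
    """Dot product of two feature vectors."""
    return sum(a * b for a, b in zip(x, y))

def rbf_kernel(x, y, gamma=0.5):
    """Gaussian (RBF) kernel; gamma is an illustrative choice."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

def gram(kernel, X):
    """Kernel (Gram) matrix of all pairs of samples in X."""
    return [[kernel(xi, xj) for xj in X] for xi in X]

def combine_kernels(grams, weights):
    """Weighted sum of kernel matrices with weights normalized to sum to 1.

    In a real MKL method the weights are learned from data; here they are
    fixed by hand (an assumption made for illustration).
    """
    if any(w < 0 for w in weights) or sum(weights) <= 0:
        raise ValueError("weights must be non-negative and not all zero")
    total = sum(weights)
    norm = [w / total for w in weights]
    n = len(grams[0])
    return [
        [sum(w * K[i][j] for w, K in zip(norm, grams)) for j in range(n)]
        for i in range(n)
    ]

# Toy feature vectors standing in for kinase-substrate pair descriptors.
X = [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
K_lin = gram(linear_kernel, X)
K_rbf = gram(rbf_kernel, X)
K_mkl = combine_kernels([K_lin, K_rbf], weights=[0.3, 0.7])
```

The combined matrix `K_mkl` could then be passed to any kernel classifier that accepts a precomputed kernel, such as a support vector machine, whereas a single-kernel baseline would use `K_lin` or `K_rbf` alone.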