Prediction of 492 human protein kinase substrate specificities

Background: Complex intracellular signaling networks monitor diverse environmental inputs to evoke appropriate and coordinated effector responses. Defective signal transduction underlies many pathologies, including cancer, diabetes, autoimmunity and about 400 other human diseases. There is therefore strong impetus to define the composition and architecture of cellular communications networks in humans. The major components of intracellular signaling networks are protein kinases and protein phosphatases, which catalyze the reversible phosphorylation of proteins. Here, we have focused on identifying kinase-substrate interactions by predicting the phosphorylation site specificity of each kinase from the primary amino acid sequence of its catalytic domain.

Results: The presented method predicts substrate specificity matrices for 488 different kinase catalytic domains in 478 typical and 4 atypical human kinases. The matrices rely on both positive and negative determinants to score individual phosphosites for their suitability as kinase substrates. This represents a marked advancement over existing methods such as those used in NetPhorest (179 kinases in 76 groups) and NetworKIN (123 kinases), which consider only positive determinants for kinase substrate prediction. Comparison of our predicted matrices with experimentally derived matrices from about 9,000 known kinase-phosphosite substrate pairs revealed a high degree of concordance with the established preferences of about 150 well-studied protein kinases. Furthermore, for many of the better-known kinases, the predicted optimal phosphosite sequences were more accurate than the consensus phosphosite sequences inferred by simple alignment of the phosphosites of known kinase substrates.

Conclusions: Application of this improved kinase substrate prediction algorithm to the primary structures of over 23,000 proteins encoded by the human genome has permitted the identification of about 650,000 putative phosphosites, which are posted on the open source PhosphoNET website (http://www.phosphonet.ca).
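To illustrate what scoring a phosphosite against a specificity matrix with both positive and negative determinants can look like, here is a minimal Python sketch of a generic log-odds position-specific scoring matrix. It is not the algorithm used in the paper, which derives matrices from the kinase catalytic domain sequence; here the matrix is built from a handful of aligned example sites. The function names, the uniform background frequency, the pseudocount, the 7-residue window and the example peptides are all assumptions made for illustration.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
BACKGROUND = 1.0 / len(AMINO_ACIDS)  # assumption: uniform background frequency


def build_log_odds_matrix(aligned_sites, pseudocount=1.0):
    """Build one {residue: log2 odds} column per position of the aligned windows.

    Residues seen more often than background get positive scores (positive
    determinants); residues seen less often get negative scores (negative
    determinants).
    """
    width = len(aligned_sites[0])
    matrix = []
    for pos in range(width):
        counts = {aa: pseudocount for aa in AMINO_ACIDS}
        for site in aligned_sites:
            if site[pos] in counts:
                counts[site[pos]] += 1
        total = sum(counts.values())
        matrix.append(
            {aa: math.log2((counts[aa] / total) / BACKGROUND) for aa in AMINO_ACIDS}
        )
    return matrix


def score_window(window, matrix):
    """Sum the per-position scores; padding characters contribute 0."""
    return sum(column.get(res, 0.0) for res, column in zip(window, matrix))


def candidate_windows(protein, flank=3):
    """Yield (1-based position, window) for every S, T or Y in the protein."""
    padded = "-" * flank + protein + "-" * flank
    for i, res in enumerate(protein):
        if res in "STY":
            yield i + 1, padded[i:i + 2 * flank + 1]


# Hypothetical aligned substrate windows (phospho-acceptor at the centre).
example_sites = ["RRASLAG", "KRRSVEE", "RRKSLDG", "RKASIAA"]
pssm = build_log_odds_matrix(example_sites)

for position, window in candidate_windows("MRRASLAGT"):
    print(position, window, round(score_window(window, pssm), 2))
```

Because the scores are log-odds, a residue that is rarely observed at a position lowers the total, which is how a matrix of this kind can penalise unfavourable residues rather than only rewarding favourable ones.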
