JUPred_MLP: Prediction of Phosphorylation Sites Using a Consensus of MLP Classifiers

Post-translational modification is the attachment of biochemical functional groups to a protein after its translation from mRNA. Among the different post-translational modifications, phosphorylation is one of the most important, as it is responsible for key cellular operations. In this work, we use multilayer perceptrons (MLPs) to predict which protein residues are phosphorylated. As features, we use position-specific scoring matrices (PSSMs) generated for each protein sequence by the PSI-BLAST algorithm after three runs against a UniProt database reduced to 90% sequence redundancy. On an independent set of 141 proteins, our system achieved the best AUC score for 36 proteins, more than any other predictor. Our system achieved an AUC score of 0.7239 over all the protein sequences combined, which is comparable to the state-of-the-art predictors.
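The pipeline summarized above (PSSM features per residue, a consensus over several classifiers, AUC as the evaluation metric) can be illustrated with a minimal sketch. This is not the authors' implementation: the window half-width, the zero padding at sequence ends, and the use of a plain average as the consensus rule are all assumptions made for illustration, and the function names are hypothetical.

```python
from typing import List, Sequence


def window_features(pssm: Sequence[Sequence[float]], center: int,
                    half_width: int = 4) -> List[float]:
    """Flatten the PSSM rows in a window centered on one residue.

    The half-width of 4 and the zero padding beyond the sequence ends
    are assumptions; the abstract does not specify either.
    """
    n_cols = len(pssm[0])
    features: List[float] = []
    for pos in range(center - half_width, center + half_width + 1):
        if 0 <= pos < len(pssm):
            features.extend(pssm[pos])
        else:
            features.extend([0.0] * n_cols)  # pad outside the sequence
    return features


def consensus_score(scores: Sequence[float]) -> float:
    """Combine per-classifier scores by averaging (assumed consensus rule)."""
    return sum(scores) / len(scores)


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Rank-based AUC: the fraction of (positive, negative) pairs in which
    the positive residue scores higher; ties count as half."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))
```

In this sketch each candidate residue would be turned into a feature vector with `window_features`, scored by each trained classifier, merged with `consensus_score`, and the merged scores evaluated against known annotations with `auc`. The pairwise AUC loop is quadratic and is meant for clarity rather than speed.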
