Identification of phosphorylation sites using a hybrid classifier ensemble approach

Protein phosphorylation is an important step in many biological processes, including the cell cycle, membrane transport, and apoptosis. We design a new classifier ensemble approach, called Bagging-Adaboost Ensemble (BAE), for the prediction of eukaryotic protein phosphorylation sites. BAE incorporates the bagging technique and the AdaBoost technique into the classifier framework to improve the accuracy, stability, and robustness of the final result. To our knowledge, this is the first time that an ensemble approach has been applied to the prediction of phosphorylation sites. Our prediction system based on BAE focuses on five kinase families: CDK, CK2, MAPK, PKA, and PKC. BAE achieves good performance on all five families, with prediction accuracies of 84.7%, 87.4%, 85.5%, 85.2%, and 82.3%, respectively.
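To make the idea of combining bagging with AdaBoost concrete, the following is a minimal sketch and not the authors' implementation. It assumes that each bagged member is an AdaBoost model trained on a bootstrap resample, that the base learner is a decision stump, that labels are +1 (phosphorylation site) and -1 (non-site), and that the final decision is a majority vote. None of these details are stated in the abstract, and all function names (`train_bae`, `bae_predict`, and so on) are hypothetical.

```python
import math
import random

def stump_predict(stump, row):
    """Predict +1/-1 with a one-feature threshold rule (assumed base learner)."""
    j, thr, pol = stump
    return pol if row[j] >= thr else -pol

def train_stump(X, y, w):
    """Exhaustively pick the stump with the lowest weighted error."""
    best, best_err = None, float("inf")
    for j in range(len(X[0])):
        for thr in sorted({row[j] for row in X}):
            for pol in (1, -1):
                err = sum(wi for row, yi, wi in zip(X, y, w)
                          if stump_predict((j, thr, pol), row) != yi)
                if err < best_err:
                    best, best_err = (j, thr, pol), err
    return best, best_err

def train_adaboost(X, y, rounds=10):
    """Discrete AdaBoost: reweight samples so later stumps focus on mistakes."""
    n = len(X)
    w = [1.0 / n] * n
    model = []
    for _ in range(rounds):
        stump, err = train_stump(X, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)  # avoid log(0) and division by zero
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, stump))
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, row))
             for row, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model

def adaboost_predict(model, row):
    """Sign of the alpha-weighted sum of stump votes."""
    score = sum(alpha * stump_predict(stump, row) for alpha, stump in model)
    return 1 if score >= 0 else -1

def train_bae(X, y, n_bags=5, rounds=10, seed=0):
    """Bagging layer (assumed nesting): one AdaBoost model per bootstrap resample."""
    rng = random.Random(seed)
    n = len(X)
    models = []
    for _ in range(n_bags):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        models.append(train_adaboost([X[i] for i in idx],
                                     [y[i] for i in idx], rounds))
    return models

def bae_predict(models, row):
    """Majority vote over the bagged AdaBoost models (assumed combination rule)."""
    votes = sum(adaboost_predict(m, row) for m in models)
    return 1 if votes >= 0 else -1
```

In this sketch, each row of `X` would be a numeric encoding of the sequence window around a candidate residue, and a separate ensemble would be trained per kinase family. The feature encoding itself is outside the scope of the sketch. Using an odd `n_bags` avoids tied votes.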