Predicting Multisite Protein Sub-cellular Locations Based on Correlation Coefficient

With the development of proteomics and cell biology, predicting protein sub-cellular location has become an active topic in bioinformatics, and a growing number of researchers are working on it. Most of this work, however, addresses only single-site proteins, even though some proteins reside in two or more sub-cellular locations. Attention should therefore shift to multisite protein sub-cellular location. In this article, we use the Virus-mPLoc data set and two effective feature extraction methods: pseudo amino acid composition and the correlation coefficient. The resulting features are fed into a multi-label k-nearest neighbor classifier to predict protein sub-cellular locations. Experiments indicate that the method is reasonable, reaching a precision of 68.65% under the jackknife test.
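The pipeline described above (sequence features, a multi-label nearest-neighbor classifier, jackknife evaluation) can be illustrated with a minimal sketch. It is not the authors' implementation, and several parts are assumptions: plain amino acid composition stands in for full pseudo amino acid composition, the correlation feature is taken to be the lagged Pearson correlation of Kyte-Doolittle hydropathy values along the sequence, a simple majority vote over neighbors replaces the Bayesian ML-kNN rule, and the score is the mean overlap between predicted and true label sets rather than the paper's precision measure. The names `features`, `predict_labels`, and `jackknife_score` are hypothetical.

```python
from math import sqrt

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Kyte-Doolittle hydropathy index (assumed property; the paper may use others).
HYDROPATHY = {
    "A": 1.8, "C": 2.5, "D": -3.5, "E": -3.5, "F": 2.8,
    "G": -0.4, "H": -3.2, "I": 4.5, "K": -3.9, "L": 3.8,
    "M": 1.9, "N": -3.5, "P": -1.6, "Q": -3.5, "R": -4.5,
    "S": -0.8, "T": -0.7, "V": 4.2, "W": -0.9, "Y": -1.3,
}


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when it is undefined."""
    n = len(xs)
    if n < 2:
        return 0.0
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def features(seq, max_lag=3):
    """20 composition values plus max_lag lagged correlation coefficients.

    Assumes seq contains only the 20 standard amino acid letters.
    """
    comp = [seq.count(a) / len(seq) for a in AMINO_ACIDS]
    h = [HYDROPATHY[a] for a in seq]
    corr = [pearson(h[:-lag], h[lag:]) for lag in range(1, max_lag + 1)]
    return comp + corr


def predict_labels(query, train_x, train_y, k=3):
    """Simplified multi-label kNN: keep labels held by most neighbors."""
    order = sorted(
        range(len(train_x)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(query, train_x[i])),
    )
    neighbors = order[:k]
    votes = {}
    for i in neighbors:
        for label in train_y[i]:
            votes[label] = votes.get(label, 0) + 1
    chosen = {lab for lab, v in votes.items() if 2 * v > len(neighbors)}
    if not chosen:
        # Fall back to the single most frequent label (ties: alphabetical).
        chosen = {max(sorted(votes), key=votes.get)}
    return chosen


def jackknife_score(xs, ys, k=3):
    """Leave-one-out mean of |predicted & true| / |predicted | true|."""
    total = 0.0
    for i in range(len(xs)):
        pred = predict_labels(xs[i], xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:], k)
        total += len(pred & ys[i]) / len(pred | ys[i])
    return total / len(xs)
```

In use, each sequence would be mapped through `features`, each protein's locations stored as a set of labels, and `jackknife_score` called on the two lists. Each protein is held out once and predicted from all the others, which is what the jackknife test does.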
