Using an improved K-nearest neighbor method to identify anti- and pro-apoptosis proteins

Apoptosis proteins play an important role in understanding the mechanism of programmed cell death, so distinguishing their subclasses can give further insight into their function. In our previous work, our group established a dataset of 239 anti-apoptosis proteins and 222 pro-apoptosis proteins. Here, features are extracted from sequence information, gene ontology information and evolutionary information. We then propose a mean value k-nearest neighbor (MKNN) algorithm. The MKNN results indicate that the decision-making method of mean value is distinctly superior to the traditional decision-making method of majority vote. For comparison, we also list the results of support vector machine (SVM) and k-nearest neighbor (KNN). Jackknife tests show that the improved method is robust, useful and reliable for predicting the subcellular location of proteins.
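
The abstract does not spell out the mean value decision rule, so the following is only a minimal sketch of one plausible reading: for each class, average the distances from the query to its k nearest training samples of that class, and assign the class with the smallest mean. The function names, the Euclidean distance, and the toy two-dimensional feature vectors are all assumptions made for illustration; they are not the authors' features or implementation. A majority-vote KNN and a jackknife (leave-one-out) accuracy helper are included for comparison.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_majority(train, query, k):
    """Traditional KNN: majority vote among the k nearest samples overall."""
    nearest = sorted(train, key=lambda item: euclidean(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def knn_mean_value(train, query, k):
    """Assumed mean-value rule (hypothetical reading of MKNN):
    per class, average the distances to the k nearest samples of that
    class, then pick the class with the smallest mean distance."""
    distances_by_class = {}
    for features, label in train:
        distances_by_class.setdefault(label, []).append(euclidean(features, query))
    mean_distance = {}
    for label, dists in distances_by_class.items():
        closest = sorted(dists)[:k]  # uses fewer than k if the class is small
        mean_distance[label] = sum(closest) / len(closest)
    return min(mean_distance, key=mean_distance.get)


def jackknife_accuracy(data, classify, k):
    """Leave-one-out test: each sample is predicted from all the others."""
    correct = 0
    for i, (features, label) in enumerate(data):
        rest = data[:i] + data[i + 1:]
        if classify(rest, features, k) == label:
            correct += 1
    return correct / len(data)


# Toy data (made up): two well-separated classes in a 2-D feature space.
toy_data = [
    ((0.0, 0.0), "anti"), ((0.2, 0.1), "anti"), ((0.1, 0.3), "anti"),
    ((1.0, 1.0), "pro"), ((0.9, 1.2), "pro"), ((1.1, 0.8), "pro"),
]

prediction = knn_mean_value(toy_data, (0.2, 0.2), k=3)
accuracy = jackknife_accuracy(toy_data, knn_mean_value, k=3)
```

On data this cleanly separated both decision rules agree; any difference between them would only show up when the nearest neighbors are mixed or the classes are unevenly dense around the query.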
