A novel representation for apoptosis protein subcellular localization prediction using support vector machine.

Apoptosis, or programmed cell death, plays an important role in the development of an organism. Knowing the subcellular location of apoptosis proteins is very helpful for understanding the mechanism of apoptosis. In this paper, based on the concept that the position distribution of amino acids is closely related to the structure and function of proteins, we introduce the concept of distance frequency [Matsuda, S., Vert, J.P., Ueda, N., Toh, H., Akutsu, T., 2005. A novel representation of protein sequences for prediction of subcellular location using support vector machines. Protein Sci. 14, 2804-2813] and propose a novel way to calculate distance frequencies. To capture local features, each protein sequence is divided into p parts of equal length. We then use this novel representation of protein sequences with a support vector machine to predict subcellular location. In jackknife tests, the overall prediction accuracy is significantly improved.
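The encoding step described above can be illustrated with a minimal sketch. The abstract does not give the exact definition of distance frequency, so the version below is an assumption: for each of the 20 standard amino acids, it counts the gaps between consecutive occurrences of that residue within each of the p parts, pooling all gaps of at least `max_distance` into the last bin. The names `split_into_parts`, `distance_frequencies`, `encode`, and the parameter `max_distance` are hypothetical and are not taken from the paper. Normalization, SVM training, and the jackknife test are not shown.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def split_into_parts(seq, p):
    """Split seq into p contiguous parts of (near-)equal length.

    Assumption: when len(seq) is not divisible by p, the first
    len(seq) % p parts each receive one extra residue.
    """
    base, extra = divmod(len(seq), p)
    parts, start = [], 0
    for i in range(p):
        end = start + base + (1 if i < extra else 0)
        parts.append(seq[start:end])
        start = end
    return parts


def distance_frequencies(seq, max_distance=5):
    """Count gaps between consecutive occurrences of each amino acid.

    Returns a flat list of 20 * max_distance raw counts. Bin d-1 holds
    gaps of exactly d; gaps >= max_distance share the last bin.
    Non-standard residue symbols are skipped.
    """
    counts = {aa: [0] * max_distance for aa in AMINO_ACIDS}
    last_seen = {}
    for pos, aa in enumerate(seq):
        if aa not in counts:
            continue
        if aa in last_seen:
            gap = pos - last_seen[aa]
            counts[aa][min(gap, max_distance) - 1] += 1
        last_seen[aa] = pos
    features = []
    for aa in AMINO_ACIDS:
        features.extend(counts[aa])
    return features


def encode(seq, p=2, max_distance=5):
    """Concatenate the distance-frequency vectors of the p parts."""
    features = []
    for part in split_into_parts(seq, p):
        features.extend(distance_frequencies(part, max_distance))
    return features
```

Under these assumptions each sequence maps to a fixed-length vector of p × 20 × `max_distance` values regardless of sequence length, which is the kind of input a support vector machine requires.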
