Protein Subcellular Location Prediction Based on Pseudo Amino Acid Composition and Immune Genetic Algorithm

Computational prediction of protein subcellular location remains an active research topic in bioinformatics. In this paper, we present a new method for predicting protein subcellular location that is based on pseudo amino acid composition and an immune genetic algorithm. Hydrophobic patterns of amino acid couples and approximate entropy are introduced to construct the pseudo amino acid composition. The immune genetic algorithm (IGA) is applied to find the fittest weight factors for the pseudo amino acid composition, which are crucial to the method. With these weight factors, high success rates are obtained in both the self-consistency test and the jackknife test, and a predictive accuracy of more than 80% is achieved in the independent dataset test. These results demonstrate that the new method is practical. They also suggest that the hydrophobic patterns of a protein sequence influence its subcellular location.
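To make the feature construction more concrete, the following minimal sketch computes the approximate entropy of a hydrophobicity-encoded sequence and appends it, scaled by a weight factor, to the 20 amino acid frequencies in the usual pseudo amino acid composition form. It is an illustration under stated assumptions, not the authors' implementation: the Kyte-Doolittle hydropathy scale, the window length `m = 2`, the tolerance of 0.2 standard deviations, the fixed `weight` argument (which the paper tunes with the IGA), and all function names are assumptions introduced here. The hydrophobic patterns of amino acid couples are not modelled.

```python
import math
import statistics

# Kyte-Doolittle hydropathy values; the scale is an assumption of this
# sketch, and the paper may encode hydrophobicity differently.
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}
AMINO_ACIDS = sorted(HYDROPATHY)


def approximate_entropy(series, m=2, r=0.2):
    """Approximate entropy ApEn(m, r) of a numeric series.

    Self-matches are counted, so every logarithm is well defined.
    """
    n = len(series)
    if n < m + 2:
        raise ValueError("series too short for the chosen m")

    def phi(k):
        count = n - k + 1
        windows = [series[i:i + k] for i in range(count)]
        total = 0.0
        for a in windows:
            # Number of windows within Chebyshev distance r of window a.
            matches = sum(
                1 for b in windows
                if max(abs(x - y) for x, y in zip(a, b)) <= r
            )
            total += math.log(matches / count)
        return total / count

    return phi(m) - phi(m + 1)


def amino_acid_composition(sequence):
    """Frequencies of the 20 standard amino acids, in alphabetical order."""
    return [sequence.count(aa) / len(sequence) for aa in AMINO_ACIDS]


def pseudo_composition(sequence, weight=0.5, m=2):
    """21-dimensional vector: 20 frequencies plus one weighted ApEn term.

    `weight` stands in for the weight factor that the paper optimizes
    with the immune genetic algorithm; here it is simply a parameter.
    """
    freqs = amino_acid_composition(sequence)
    signal = [HYDROPATHY[aa] for aa in sequence]
    tolerance = 0.2 * statistics.pstdev(signal)  # assumed tolerance choice
    theta = approximate_entropy(signal, m=m, r=tolerance)
    denom = sum(freqs) + weight * theta
    return [f / denom for f in freqs] + [weight * theta / denom]


if __name__ == "__main__":
    # Hypothetical example sequence, used only to exercise the functions.
    vector = pseudo_composition("MKTAYIAKQRQISFVKSHFSRQ")
    print(len(vector), round(sum(vector), 6))
```

In this form the weight factor controls how strongly the entropy term competes with the plain composition, which is why its value matters for classification and why a search procedure such as the IGA is a natural way to choose it.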
