Prediction of protein structure classes with pseudo amino acid composition and fuzzy support vector machine network.

Developing automated methods that determine protein structures quickly and accurately is a critical challenge, because the gap between the number of proteins with known sequences and the number with known structures keeps widening in the post-genomic age. Knowledge of a protein's structural class provides useful information towards determining its structure. It is therefore highly desirable to develop computational methods that identify the structural classes of newly found proteins from their primary sequences. In this study, following Chou's concept of pseudo amino acid composition (PseAA), eight PseAA vectors are used to represent each protein sample. Each PseAA vector is a 40-D (dimensional) vector constructed from the conventional amino acid composition (AA) and a series of sequence-order correlation factors, as originally introduced by Chou. The eight PseAA representations differ in the physicochemical property used to incorporate the sequence-order effects of the protein samples. Based on this framework, a dual-layer fuzzy support vector machine (FSVM) network is proposed to predict protein structural classes. In the first layer of the FSVM network, eight FSVM classifiers are established, each trained on a different PseAA vector. The second-layer FSVM classifier then reclassifies the outputs of the first layer. The results thus obtained are quite promising, indicating that the new method may become a useful tool for predicting not only the structural classes of proteins but also their other attributes.
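The 40-D PseAA construction described above can be illustrated with a minimal Python sketch. It assumes the commonly cited form of Chou's definition: 20 amino acid frequencies followed by 20 sequence-order correlation factors (tiers k = 1..20), each being the mean squared difference of one standardized physicochemical property between residues k positions apart. The function names `standardize` and `pse_aa`, the weight value 0.05, and the use of the Kyte-Doolittle hydropathy scale are assumptions made for illustration; the abstract does not state which eight properties or which weight the study used.

```python
from statistics import mean, pstdev

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Kyte-Doolittle hydropathy index, used here only as a stand-in property
# (assumption: the abstract does not name the eight properties of the study).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def standardize(scale):
    """Rescale a property to zero mean and unit variance over the 20 residues."""
    values = [scale[a] for a in AMINO_ACIDS]
    mu, sigma = mean(values), pstdev(values)
    return {a: (scale[a] - mu) / sigma for a in AMINO_ACIDS}


def pse_aa(sequence, scale, lam=20, weight=0.05):
    """Return a (20 + lam)-dimensional PseAA vector for one protein sequence.

    lam=20 gives the 40-D vector mentioned in the abstract; weight=0.05 is an
    assumed value for the sequence-order weight factor.
    """
    if any(res not in AMINO_ACIDS for res in sequence):
        raise ValueError("sequence contains a non-standard residue")
    length = len(sequence)
    if length <= lam:
        raise ValueError("sequence must be longer than the number of tiers")

    h = standardize(scale)

    # Conventional amino acid composition: 20 occurrence frequencies.
    freqs = [sequence.count(a) / length for a in AMINO_ACIDS]

    # k-th tier correlation factor: mean squared property difference
    # between residues separated by k positions along the chain.
    thetas = []
    for k in range(1, lam + 1):
        diffs = [(h[sequence[i]] - h[sequence[i + k]]) ** 2
                 for i in range(length - k)]
        thetas.append(sum(diffs) / (length - k))

    # Joint normalization so that all components sum to 1.
    denom = sum(freqs) + weight * sum(thetas)
    return [f / denom for f in freqs] + [weight * t / denom for t in thetas]
```

Under these assumptions, calling `pse_aa` once per property scale would yield the eight alternative representations of a protein, one for each first-layer classifier. The FSVM classifiers themselves are not sketched here.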
