Protein fold classification with Grow-and-Learn network

Protein fold classification is an important problem in computational biology and a challenging task from a machine learning perspective. In this study, we propose a method for classifying protein folds that combines a Grow-and-Learn (GAL) neural network with the one-versus-others (OvO) method. To classify the 27 most common protein folds, we use 125-dimensional feature vectors derived from the physicochemical properties of amino acids. The study is conducted on a database of 694 proteins: 311 are used for training and 383 for testing. Overall, the classification system achieves 81.2% fold recognition accuracy on the test set, in which most proteins have less than 25% sequence identity with those used for training. To assess the GAL network against other methods, we also compare several approaches, and GAL’s accuracy is found to be higher than that of the existing methods for protein fold classification.
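The one-versus-others idea can be illustrated with a minimal sketch. This is not the authors' implementation: the class `GrowOnErrorBinary` is a hypothetical, simplified prototype classifier that only mimics the growth step usually associated with GAL (store a training sample as a new unit whenever the current network misclassifies it), and the rule for combining the binary classifiers (largest margin between nearest negative and nearest positive prototype) is an assumption. The toy data are 2-dimensional with 3 classes, whereas the study uses 125-dimensional vectors and 27 folds.

```python
import math
import random


def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


class GrowOnErrorBinary:
    """Hypothetical GAL-inspired binary classifier: grows a prototype on each error."""

    def __init__(self):
        self.protos = []  # list of (vector, label), label is 1 (this class) or 0 (others)

    def _nearest(self, x, label=None):
        # Nearest prototype, optionally restricted to one label.
        cands = [(dist(x, p), l) for p, l in self.protos if label is None or l == label]
        return min(cands) if cands else (math.inf, None)

    def fit(self, X, y, max_epochs=10):
        for _ in range(max_epochs):
            grew = False
            for x, t in zip(X, y):
                _, pred = self._nearest(x)
                if pred != t:  # misclassified: store the sample as a new unit
                    self.protos.append((list(x), t))
                    grew = True
            if not grew:  # a full pass without errors, stop growing
                break
        return self

    def score(self, x):
        # Assumed confidence measure: positive when the nearest unit is a positive one.
        d_pos, _ = self._nearest(x, 1)
        d_neg, _ = self._nearest(x, 0)
        if math.isinf(d_pos):
            return -math.inf
        return d_neg - d_pos


class OneVsOthers:
    """Trains one binary classifier per class and predicts the highest-scoring class."""

    def fit(self, X, y):
        self.classes = sorted(set(y))
        self.models = {
            c: GrowOnErrorBinary().fit(X, [1 if t == c else 0 for t in y])
            for c in self.classes
        }
        return self

    def predict(self, x):
        return max(self.classes, key=lambda c: self.models[c].score(x))


def make_toy_data(n_per_class=20, seed=0):
    """Three well-separated 2-D clusters (illustrative stand-in for fold features)."""
    rng = random.Random(seed)
    centers = {0: (0.0, 0.0), 1: (10.0, 0.0), 2: (0.0, 10.0)}
    X, y = [], []
    for label, (cx, cy) in centers.items():
        for _ in range(n_per_class):
            X.append([cx + rng.gauss(0, 0.5), cy + rng.gauss(0, 0.5)])
            y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    clf = OneVsOthers().fit(X, y)
    print([clf.predict(c) for c in ([0, 0], [10, 0], [0, 10])])
```

In this sketch each binary classifier sees the full training set with relabelled targets, so the number of stored units can differ from class to class. A real system would also need a pruning step and a held-out test set, as in the 311/383 split described above.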
