An evolutionary approach for protein classification using feature extraction by artificial neural network

A protein superfamily consists of proteins that share amino acid sequence homology and are therefore functionally and structurally related. Generally, two proteins are assigned to the same class if they have most of their extracted features in common. As protein databases continue to grow, an intelligent system that classifies proteins with high accuracy becomes increasingly desirable. Artificial neural networks have been successfully applied to problems in pattern classification, function approximation, optimization, and associative memories. Multilayer feedforward networks are trained using the backpropagation (BP) learning algorithm, but BP is limited to searching for a suitable set of weights within an a priori fixed network topology. This mandates the selection of appropriately optimized synaptic weights for the learning problem at hand. The Genetic Algorithm (GA) is a stochastic global search technique that can be used to find optimized synaptic weights. Thus, a hybrid method combining GA and BP (GA-BP) is implemented in this paper. Because the GA suffers from limitations such as premature convergence and low local convergence speed, an improved variant is also implemented. This Adaptive Genetic Algorithm hybrid (AGA-BP) adaptively updates the crossover and mutation probabilities, which gives better results than GA-BP and traditional BP in terms of speed, predictive accuracy, and precision of convergence.
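The abstract does not spell out how the crossover and mutation probabilities are updated. As a minimal sketch, assuming the widely used adaptive scheme of Srinivas and Patnaik (1994), the rates could be computed from the population fitness as below. The function name `adaptive_rates`, its parameter names, the default constants, and the fallback for a fully converged population are illustrative assumptions, not details taken from the paper.

```python
def adaptive_rates(f_parent, f_individual, f_max, f_avg,
                   k1=1.0, k2=0.5, k3=1.0, k4=0.5):
    """Return (p_crossover, p_mutation) for a maximisation problem.

    f_parent     -- larger fitness of the two parents selected for crossover
    f_individual -- fitness of the individual about to be mutated
    f_max, f_avg -- best and mean fitness of the current population
    k1..k4       -- assumed constants; these defaults are illustrative
    """
    spread = f_max - f_avg
    if spread <= 0:
        # Assumed fallback: when every individual has the same fitness,
        # use the maximum rates to reintroduce diversity.
        return k3, k4

    # Above-average solutions get lower rates, so they are disrupted less;
    # the best solution gets a rate of zero.
    if f_parent >= f_avg:
        p_c = k1 * (f_max - f_parent) / spread
    else:
        p_c = k3  # below-average parents are always recombined

    if f_individual >= f_avg:
        p_m = k2 * (f_max - f_individual) / spread
    else:
        p_m = k4  # below-average individuals get the full mutation rate

    return p_c, p_m


if __name__ == "__main__":
    # Population with best fitness 10 and mean fitness 5.
    for f in (10.0, 7.5, 3.0):
        print(f, adaptive_rates(f, f, f_max=10.0, f_avg=5.0))
```

In a scheme of this kind, fit individuals are protected while poor ones are recombined and mutated aggressively, which is one way to counter premature convergence without giving up local convergence speed.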
