Pruning neural networks for protein secondary structure prediction

Secondary structure prediction is an effective approach to deducing the three-dimensional structure and function of proteins. Although multilayer neural networks are currently used for this prediction, choosing an appropriate network size remains an important factor in improving network performance. In this work, two systematic approaches for pruning oversized multilayer perceptron neural networks (MLP-NN) are proposed to determine the optimum size of the hidden layer. Using the RS126 dataset with seven-fold cross-validation, the prediction accuracy reaches 75.38%.
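To make the idea of shrinking an oversized hidden layer concrete, the following is a minimal sketch of one generic pruning heuristic: ranking hidden units by the norm of their outgoing weights and dropping the weakest. It is not the pair of approaches proposed in this work, which the abstract does not describe. The function names, the layer sizes (a 13-residue window over 20 amino acid types, 30 hidden units, 3 output classes), and the saliency criterion are all assumptions chosen for illustration.

```python
import numpy as np

def prune_hidden_units(W1, b1, W2, keep):
    """Keep the `keep` hidden units with the largest outgoing-weight norm.

    W1: (n_in, n_hidden), b1: (n_hidden,), W2: (n_hidden, n_out).
    The saliency criterion is an illustrative assumption, not the paper's.
    """
    saliency = np.linalg.norm(W2, axis=1)          # one score per hidden unit
    kept = np.sort(np.argsort(saliency)[-keep:])   # indices of retained units
    return W1[:, kept], b1[kept], W2[kept, :], kept

def forward(X, W1, b1, W2, b2):
    """Single-hidden-layer MLP with tanh units and linear outputs."""
    hidden = np.tanh(X @ W1 + b1)
    return hidden @ W2 + b2

# Hypothetical sizes: 13-residue window x 20 amino acids, 3 classes (H, E, C).
rng = np.random.default_rng(0)
n_in, n_hidden, n_out = 13 * 20, 30, 3
W1 = rng.normal(size=(n_in, n_hidden))
b1 = rng.normal(size=n_hidden)
W2 = rng.normal(size=(n_hidden, n_out))
b2 = rng.normal(size=n_out)

# Simulate two redundant hidden units that contribute nothing to the output.
W2[[4, 17], :] = 0.0

W1p, b1p, W2p, kept = prune_hidden_units(W1, b1, W2, keep=28)
X = rng.normal(size=(5, n_in))
```

Because the two removed units have zero outgoing weights, the pruned network here produces the same outputs as the original. In practice, pruning usually changes the outputs slightly, so the smaller network would be retrained and its accuracy rechecked, for example under cross-validation.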
