Prediction of inter-residue contacts map based on genetic algorithm optimized radial basis function neural network and binary input encoding scheme

Summary: Inter-residue contact map prediction is one of the most important intermediate steps toward solving the protein folding problem. In this paper, we address protein inter-residue contact map prediction with a neural network approach. First, we use a genetic algorithm (GA) to optimize the radial basis function widths and hidden centers of a radial basis function neural network (RBFNN). A novel binary encoding scheme is then employed to train the network to learn and predict the inter-residue contact patterns of protein sequences obtained from the Protein Data Bank (PDB). The experimental results indicate the utility of the proposed encoding strategy and of the GA-optimized RBFNN. Moreover, the simulation results show that the network performs better on proteins whose sequence length falls in the range (100, 300), and that the prediction accuracy with a contact threshold of 7 Å is higher than with the other three thresholds of 5, 6, and 8 Å.
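To make the two central objects of the summary concrete, the following is a minimal sketch, not the authors' implementation. It assumes that a contact is defined by the Euclidean distance between one representative atom per residue (for example the C-alpha atom), and that the RBFNN hidden units use the common Gaussian basis function; the summary specifies neither choice. The names `contact_map`, `rbf_output`, and `min_separation` are hypothetical. The centers and widths passed to `rbf_output` are the quantities that the GA would tune.

```python
import math

def contact_map(coords, threshold=7.0, min_separation=1):
    """Build a symmetric 0/1 contact map from per-residue 3D coordinates.

    coords: list of (x, y, z) tuples, one representative atom per residue
            (assumption: e.g. C-alpha; the summary does not say which atom).
    threshold: contact distance cutoff in angstroms (the summary compares 5-8).
    min_separation: minimum sequence separation for a pair to be considered.
    """
    n = len(coords)
    cmap = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + min_separation, n):
            if math.dist(coords[i], coords[j]) <= threshold:
                cmap[i][j] = cmap[j][i] = 1
    return cmap

def rbf_output(x, centers, widths, weights):
    """Forward pass of a single-output RBF network.

    Assumes Gaussian hidden units: phi_k = exp(-||x - c_k||^2 / (2 * s_k^2)).
    centers and widths are the parameters a GA could search over.
    """
    total = 0.0
    for c, s, w in zip(centers, widths, weights):
        sq_dist = sum((xi - ci) ** 2 for xi, ci in zip(x, c))
        total += w * math.exp(-sq_dist / (2.0 * s ** 2))
    return total

# Toy chain of four residues spaced 3.8 angstroms apart along the x axis.
chain = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
map_7 = contact_map(chain, threshold=7.0)
map_8 = contact_map(chain, threshold=8.0)
```

In this toy chain, residues two positions apart are 7.6 Å from each other, so they count as a contact at the 8 Å threshold but not at 7 Å, which illustrates how the choice of threshold changes the map the network is asked to predict.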
