A novel EDAs based method for HP model protein folding

The protein structure prediction (PSP) problem is one of the most important problems in computational biology. This paper proposes a novel method based on Estimation of Distribution Algorithms (EDAs) to solve the PSP problem on the HP model. First, a composite fitness function that incorporates information about the formation of the folding structure core is introduced to replace the traditional fitness function of the HP model. It helps select better individuals for the probabilistic model of the EDA. In addition, a set of guided operators is used to increase population diversity and the likelihood of escaping from local optima. Second, an improved backtracking repair algorithm is proposed to repair invalid individuals sampled from the probabilistic model of the EDA for long protein sequences. A feasibility detection procedure is added to avoid entering invalid closed areas when selecting directions for the residues. This significantly reduces the number of backtracking operations and the computational cost for long protein sequences. Experimental results demonstrate that the proposed method outperforms the basic EDA method and is very competitive with other existing algorithms for the PSP problem on lattice HP models.
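For orientation, the traditional HP-model fitness that the composite function is said to replace can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes a 2D square lattice, an absolute move encoding (`U`, `D`, `L`, `R`), and the usual convention that each pair of non-consecutive H residues on adjacent lattice sites contributes -1 to the energy. The names `decode` and `hp_energy` are hypothetical, and the composite fitness, guided operators, and repair procedure described above are not reproduced here.

```python
from typing import List, Optional, Tuple

# Assumed encoding: absolute moves on the 2D square lattice.
MOVES = {"U": (0, 1), "D": (0, -1), "L": (-1, 0), "R": (1, 0)}


def decode(moves: str) -> Optional[List[Tuple[int, int]]]:
    """Return lattice coordinates of the residues, or None if the walk
    revisits a site (an invalid, non-self-avoiding conformation)."""
    pos = (0, 0)
    coords = [pos]
    seen = {pos}
    for m in moves:
        dx, dy = MOVES[m]
        pos = (pos[0] + dx, pos[1] + dy)
        if pos in seen:
            return None
        seen.add(pos)
        coords.append(pos)
    return coords


def hp_energy(sequence: str, moves: str) -> Optional[int]:
    """Traditional HP energy: minus the number of H-H contacts between
    residues that are lattice neighbours but not chain neighbours.
    Returns None for an invalid conformation."""
    if len(moves) != len(sequence) - 1:
        raise ValueError("need exactly one move per bond")
    coords = decode(moves)
    if coords is None:
        return None
    index = {c: i for i, c in enumerate(coords)}
    contacts = 0
    for i, (x, y) in enumerate(coords):
        if sequence[i] != "H":
            continue
        # Look only right and up so that each pair is counted once.
        for dx, dy in ((1, 0), (0, 1)):
            j = index.get((x + dx, y + dy))
            if j is not None and sequence[j] == "H" and abs(i - j) > 1:
                contacts += 1
    return -contacts
```

For example, `hp_energy("HPPH", "RUL")` folds the chain around a unit square so that the two H residues become lattice neighbours, giving an energy of -1, while a move string that revisits a site yields `None`. Such invalid individuals are the ones a repair step would need to handle.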
