A novel ab-initio genetic-based approach for protein folding prediction

This paper proposes a genetic-algorithm-based model for protein folding prediction. Its main features are: i) heuristic secondary structure information is used to initialize the genetic algorithm; ii) an enhanced 3D spatial representation called the cube-octahedron is used, together with an expansion technique proposed to reduce computational complexity and spatial constraints; iii) the geometric features of the cube-octahedron are pre-processed, with twelve basic vectors defining the nodes, and biological information (torsion angles, bond angles and secondary structure conformations) is pre-processed by analyzing all possible combinations of the basic vectors that satisfy the biological constraints defined by the spatial representation; and iv) hashing techniques are used to improve computational efficiency, with the pre-processed information stored in hash tables that the genetic algorithm uses intensively. Experiments carried out to validate the proposed model produced promising results.
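To make items iii) and iv) concrete, the following is a minimal sketch of how twelve basic vectors could be enumerated and how bond angles for pairs of consecutive vectors could be precomputed into a hash table. It is an illustration under stated assumptions, not the authors' implementation: the coordinates are the standard cuboctahedron vertices (permutations of (±1, ±1, 0)), the function names `basic_vectors`, `bond_angle` and `build_angle_table` are hypothetical, and the set of allowed angles passed in is a placeholder rather than the biological constraint set used in the paper.

```python
from itertools import product
import math


def basic_vectors():
    """Return the twelve cuboctahedron vertex directions.

    Assumption: the standard coordinates, i.e. all permutations of
    (+-1, +-1, 0). The paper does not list its vectors explicitly.
    """
    vecs = []
    for zero_axis in range(3):
        for a, b in product((1, -1), repeat=2):
            v = [a, b]
            v.insert(zero_axis, 0)  # place the zero component
            vecs.append(tuple(v))
    return vecs


def bond_angle(u, v):
    """Angle in whole degrees at the middle node of a three-node chain
    whose consecutive steps are u and then v.

    The angle is measured between -u (pointing back to the previous node)
    and v (pointing to the next node). Every basic vector has squared
    length 2, hence the division by 2.
    """
    cos = -sum(a * b for a, b in zip(u, v)) / 2.0
    cos = max(-1.0, min(1.0, cos))  # guard against rounding error
    return round(math.degrees(math.acos(cos)))


def build_angle_table(allowed_angles):
    """Precompute a hash table keyed by (index of u, index of v).

    Only pairs whose bond angle is in allowed_angles are stored, so a
    search procedure can test feasibility with one dictionary lookup.
    allowed_angles is a hypothetical stand-in for the paper's constraints.
    """
    vecs = basic_vectors()
    table = {}
    for i, u in enumerate(vecs):
        for j, v in enumerate(vecs):
            angle = bond_angle(u, v)
            if angle in allowed_angles:
                table[(i, j)] = angle
    return table


if __name__ == "__main__":
    table = build_angle_table({90, 120})
    print(len(basic_vectors()), "basic vectors,", len(table), "allowed pairs")
```

With this layout, checking whether one step may follow another during mutation or crossover is a single dictionary lookup instead of a repeated trigonometric computation. Torsion angles would need tables keyed by triples of vectors, which this sketch omits.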
