Incorporating Knowledge of Secondary Structures in a L-System-Based Encoding for Protein Folding

An encoding scheme for protein folding on lattice models, inspired by parametric L-systems, is proposed. The encoding incorporates problem-domain knowledge in the form of predesigned production rules that capture two well-known secondary structures: α-helices and β-sheets. Its ability to capture protein native conformations was tested using an evolutionary algorithm as the inference procedure for discovering L-systems, and the results confirmed the suitability of the proposed representation. The occurrence of motifs and sub-structures appears to be an important component of protein folding, and a grammar-based encoding may capture these sub-structures. This line of research suggests novel and compact encoding schemes for protein folding that may have practical implications for meaningful problems in biotechnology, such as structure prediction and protein folding.
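To make the idea of a grammar-based encoding concrete, the following is a minimal sketch, not the encoding used in this work. It assumes a plain (non-parametric) L-system, a 2D square lattice with relative moves `F`, `L`, `R`, and the standard HP energy model (one unit of energy gained per non-consecutive H-H contact). The rule set, the symbols `S` and `T`, and the function names `expand`, `fold`, and `hp_energy` are all hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: the alphabet, rules, and energy model are
# assumptions for demonstration, not the encoding described in the paper.

# Hypothetical production rules. "T" stands for a predesigned motif
# (here a simple U-turn); terminals F, L, R are copied unchanged.
RULES = {
    "S": "FT",   # start symbol: one forward step followed by the motif
    "T": "LL",   # motif: two consecutive left turns
}

def expand(axiom, rules, iterations):
    """Rewrite every symbol in parallel, `iterations` times."""
    s = axiom
    for _ in range(iterations):
        s = "".join(rules.get(ch, ch) for ch in s)
    return s

def fold(moves):
    """Place residues on a 2D square lattice using relative moves.

    Returns the list of coordinates, or None if the walk revisits a site.
    """
    x, y = 0, 0
    dx, dy = 1, 0                      # initial heading: +x
    coords = [(x, y)]
    occupied = {(x, y)}
    for m in moves:
        if m == "L":
            dx, dy = -dy, dx           # rotate heading 90 degrees left
        elif m == "R":
            dx, dy = dy, -dx           # rotate heading 90 degrees right
        elif m != "F":
            raise ValueError(f"unexpanded or unknown symbol: {m!r}")
        x, y = x + dx, y + dy
        if (x, y) in occupied:
            return None                # not a self-avoiding walk
        occupied.add((x, y))
        coords.append((x, y))
    return coords

def hp_energy(sequence, coords):
    """HP-model energy: -1 per lattice contact between non-consecutive H residues."""
    index = {pos: i for i, pos in enumerate(coords)}
    energy = 0
    for i, (x, y) in enumerate(coords):
        if sequence[i] != "H":
            continue
        # Look only right and up so that each contact is counted once.
        for nb in ((x + 1, y), (x, y + 1)):
            j = index.get(nb)
            if j is not None and sequence[j] == "H" and abs(i - j) > 1:
                energy -= 1
    return energy

moves = expand("S", RULES, 2)          # "S" -> "FT" -> "FLL"
coords = fold(moves)
energy = hp_energy("HPPH", coords)
```

In a setting like the one described above, an evolutionary algorithm would search over rule sets such as `RULES`, scoring each candidate by the energy of the conformation it generates; invalid (self-intersecting) folds would need a penalty or repair step, which this sketch leaves out.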
