Constraint Logic Programming approach to protein structure prediction

Background

Protein structure prediction is one of the most challenging problems in the biological sciences. Many approaches have been proposed that use database information, simplified protein models, or both. The problem can be cast as an optimization problem. Despite its importance, it has very seldom been tackled with Constraint Logic Programming, a declarative programming paradigm suited to solving combinatorial optimization problems.

Results

Constraint Logic Programming techniques have been applied to the protein structure prediction problem on the face-centered cubic lattice model. Molecular dynamics techniques, endowed with the notion of constraint, have also been exploited. Even with a very simplified model, Constraint Logic Programming on the face-centered cubic lattice allowed us to obtain acceptable results for a few small proteins. As a test implementation, their (known) secondary structure and the presence of disulfide bridges are used as constraints. The simplified structures obtained in this way have been converted to all-atom models with plausible structure. The results have been compared with those of a similar approach based on a well-established technique, molecular dynamics.

Conclusions

The results obtained on small proteins show that Constraint Logic Programming techniques can be employed to study simplified protein models, which can then be converted into realistic all-atom models. The advantages of Constraint Logic Programming over other, much more widely explored methodologies lie in rapid software prototyping, in the ease of encoding heuristics, and in the ability to exploit the advances made in this research area, e.g. in constraint propagation and its use for pruning the huge search space.
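To make the optimization view concrete, the following is a minimal Python sketch, not the authors' Constraint Logic Programming implementation. It places a chain on the face-centered cubic lattice as a self-avoiding walk and exhaustively searches for the lowest-energy placement. The toy energy (minus one for every contact between non-consecutive residues labelled `H`), the function names `energy` and `fold`, and the restriction to very short chains are all assumptions made for illustration; the abstract does not specify the energy function, and a constraint solver would prune the search through propagation rather than enumerate it.

```python
from itertools import product

# The 12 nearest-neighbour vectors of the face-centered cubic lattice:
# exactly two coordinates are +1 or -1 and the remaining one is 0.
FCC_MOVES = [v for v in product((-1, 0, 1), repeat=3)
             if sum(abs(c) for c in v) == 2]
FCC_MOVE_SET = set(FCC_MOVES)


def in_contact(p, q):
    """Return True if lattice points p and q are nearest neighbours."""
    return (q[0] - p[0], q[1] - p[1], q[2] - p[2]) in FCC_MOVE_SET


def energy(seq, coords):
    """Toy energy (an assumption): -1 per contact between two 'H'
    residues that are not adjacent along the chain."""
    total = 0
    for i in range(len(coords)):
        for j in range(i + 2, len(coords)):
            if seq[i] == "H" and seq[j] == "H" and in_contact(coords[i], coords[j]):
                total -= 1
    return total


def fold(seq):
    """Exhaustive search over self-avoiding walks; needs len(seq) >= 2.

    Returns (best_energy, best_coords). Only practical for tiny chains.
    """
    best_energy, best_coords = 1, None  # 1 is above any reachable energy

    # Fix the first two residues to remove translational and some
    # rotational symmetry from the search.
    coords = [(0, 0, 0), (1, 1, 0)]
    occupied = set(coords)

    def extend():
        nonlocal best_energy, best_coords
        if len(coords) == len(seq):
            e = energy(seq, coords)
            if e < best_energy:
                best_energy, best_coords = e, list(coords)
            return
        last = coords[-1]
        for v in FCC_MOVES:
            p = (last[0] + v[0], last[1] + v[1], last[2] + v[2])
            if p in occupied:  # self-avoidance constraint
                continue
            coords.append(p)
            occupied.add(p)
            extend()
            occupied.remove(p)
            coords.pop()

    extend()
    return best_energy, best_coords


if __name__ == "__main__":
    print(fold("HHPH"))
```

In a constraint-based formulation, known secondary structure or a disulfide bridge would appear as additional restrictions on which placements `extend` is allowed to try, which is where pruning of the search space comes from.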
