Protein side-chain placement: probabilistic inference and integer programming methods

Predicting energetically favorable side-chain conformations is a fundamental step in homology modeling of proteins and in the design of novel protein sequences. The continuous space of side-chain conformations can be approximated by a discrete set of statistically representative conformations, called rotamers. The problem is then to select one rotamer for each amino acid so that a potential energy function is minimized. This is the Global Minimum Energy Conformation (GMEC) problem, which is an NP-hard optimization problem. The Dead-End Elimination theorem combined with the A* algorithm (DEE/A*) has been applied successfully to this problem. However, DEE fails to converge for some complex instances. In this paper, we explore two alternatives to DEE/A* for solving the GMEC problem. First, we use a probabilistic inference method, the max-product (MP) belief-propagation algorithm, to estimate the GMEC; the estimate is often exact. Second, we investigate integer linear programming (ILP) formulations that yield the exact solution. Known ILP formulations can be applied directly to the GMEC problem; we review these formulations and compare their effectiveness using CPLEX optimizers. We also present preliminary work towards applying the branch-and-price approach to the GMEC problem. The preliminary results suggest that the max-product algorithm is very effective for the GMEC problem. Although the max-product algorithm is an approximate method, its speed and accuracy are comparable to those of DEE/A* on large side-chain placement problems and may be superior
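
To make the objective concrete, the following is a minimal sketch, not the paper's implementation. It assumes the usual pairwise form of the energy (a self term per residue plus an interaction term per residue pair) and runs max-product in the negative-log domain, where it becomes min-sum message passing. All names (`total_energy`, `brute_force_gmec`, `min_sum_bp`, `SELF_E`, `PAIR_E`) and all energy values are invented for illustration. The toy interaction graph is a three-residue chain, on which min-sum is exact; on the loopy graphs of real proteins it is only an approximation.

```python
import itertools

# Hypothetical toy instance: 3 residues with 2, 3 and 2 rotamers.
# SELF_E[i][r] is the self energy of rotamer r at residue i.
SELF_E = [[0.0, 1.0], [0.5, 0.0, 2.0], [1.0, 0.0]]
# PAIR_E[(i, j)][ri][rj] is the interaction energy between residues i and j.
PAIR_E = {
    (0, 1): [[0.0, 2.0, 0.3], [1.0, 0.0, 0.5]],
    (1, 2): [[0.0, 1.5], [2.0, 0.2], [0.1, 0.0]],
}


def total_energy(assignment, self_e, pair_e):
    """Sum of self energies plus pairwise energies for one rotamer choice."""
    energy = sum(self_e[i][r] for i, r in enumerate(assignment))
    for (i, j), table in pair_e.items():
        energy += table[assignment[i]][assignment[j]]
    return energy


def brute_force_gmec(self_e, pair_e):
    """Enumerate every assignment; feasible only for tiny instances."""
    choices = [range(len(s)) for s in self_e]
    return min(itertools.product(*choices),
               key=lambda a: total_energy(a, self_e, pair_e))


def min_sum_bp(self_e, pair_e, iterations=20):
    """Min-sum message passing (max-product in the negative-log domain)."""
    n = len(self_e)
    neighbors = {i: [] for i in range(n)}
    for i, j in pair_e:
        neighbors[i].append(j)
        neighbors[j].append(i)

    def pair(i, j, ri, rj):
        # Look up the interaction regardless of how the edge was stored.
        if (i, j) in pair_e:
            return pair_e[(i, j)][ri][rj]
        return pair_e[(j, i)][rj][ri]

    # msgs[(i, j)][rj]: message from residue i to residue j about rotamer rj.
    msgs = {(i, j): [0.0] * len(self_e[j])
            for i in range(n) for j in neighbors[i]}

    for _ in range(iterations):
        new_msgs = {}
        for (i, j) in msgs:
            out = []
            for rj in range(len(self_e[j])):
                out.append(min(
                    self_e[i][ri] + pair(i, j, ri, rj)
                    + sum(msgs[(k, i)][ri] for k in neighbors[i] if k != j)
                    for ri in range(len(self_e[i]))
                ))
            shift = min(out)  # normalize so messages stay bounded
            new_msgs[(i, j)] = [v - shift for v in out]
        msgs = new_msgs

    # Each residue picks the rotamer with the lowest belief (min-marginal).
    result = []
    for i in range(n):
        belief = [self_e[i][r] + sum(msgs[(k, i)][r] for k in neighbors[i])
                  for r in range(len(self_e[i]))]
        result.append(belief.index(min(belief)))
    return tuple(result)
```

On this chain, `min_sum_bp(SELF_E, PAIR_E)` and `brute_force_gmec(SELF_E, PAIR_E)` return the same assignment. The brute-force search visits every combination of rotamers, so its cost grows exponentially with the number of residues, which is why methods such as DEE/A*, message passing, or ILP are needed on realistic instances.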
