An exact algorithm for side-chain placement in protein design

Computational protein design aims at constructing novel or improved functions on the structure of a given protein backbone and has important applications in the pharmaceutical and biotechnology industries. The underlying combinatorial side-chain placement (SCP) problem consists of choosing a side-chain conformation for each residue position such that the resulting overall energy is minimized. The choice of the side-chain then also determines the amino acid at that position. Many algorithms for this $\mathcal{NP}$-hard problem have been proposed in the context of homology modeling, but they reach their limits when faced with large protein design instances. In this paper, we propose a new exact method for the SCP problem that works well even for the large instance sizes that appear in protein design. Our main contribution is a dedicated branch-and-bound algorithm that combines tight upper and lower bounds resulting from a novel Lagrangian relaxation approach for SCP. Our experimental results show that our method outperforms alternative state-of-the-art exact approaches and makes it possible to routinely solve large protein design instances to optimality.
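To make the combinatorial structure of SCP concrete, the following is a minimal sketch under assumptions that the abstract does not state: the energy is taken to be a sum of per-position self terms and pairwise interaction terms, and the names `self_e`, `pair_e`, `total_energy`, `brute_force`, and `branch_and_bound` are hypothetical. The branch-and-bound shown here uses a naive lower bound for illustration only; it is not the Lagrangian relaxation bound proposed in the paper.

```python
from itertools import product


def total_energy(choice, self_e, pair_e):
    """Energy of a full assignment: self terms plus all pairwise terms.

    Assumed model (not from the paper): self_e[i][r] is the self energy of
    candidate r at position i; pair_e[(i, j)][r][s] with i < j is the
    interaction energy of candidate r at i with candidate s at j.
    """
    n = len(choice)
    energy = sum(self_e[i][choice[i]] for i in range(n))
    for i in range(n):
        for j in range(i + 1, n):
            energy += pair_e[(i, j)][choice[i]][choice[j]]
    return energy


def brute_force(self_e, pair_e):
    """Enumerate every assignment; only feasible for tiny instances."""
    candidates = product(*(range(len(s)) for s in self_e))
    best = min(candidates, key=lambda c: total_energy(c, self_e, pair_e))
    return list(best), total_energy(best, self_e, pair_e)


def branch_and_bound(self_e, pair_e):
    """Depth-first branch-and-bound with a naive (non-Lagrangian) bound."""
    n = len(self_e)
    best = {"energy": float("inf"), "choice": None}

    def lower_bound(partial, energy_so_far):
        # For each unassigned position j, take the cheapest candidate,
        # counting its self energy, its interactions with the positions
        # already fixed, and the most favourable interaction with every
        # later unassigned position.
        k = len(partial)
        bound = energy_so_far
        for j in range(k, n):
            per_candidate = []
            for r in range(len(self_e[j])):
                e = self_e[j][r]
                e += sum(pair_e[(i, j)][partial[i]][r] for i in range(k))
                e += sum(min(pair_e[(j, l)][r]) for l in range(j + 1, n))
                per_candidate.append(e)
            bound += min(per_candidate)
        return bound

    def recurse(partial, energy_so_far):
        k = len(partial)
        if k == n:
            # Complete assignment: energy_so_far is its exact energy.
            if energy_so_far < best["energy"]:
                best["energy"] = energy_so_far
                best["choice"] = list(partial)
            return
        if lower_bound(partial, energy_so_far) >= best["energy"]:
            return  # prune: this subtree cannot beat the incumbent
        for r in range(len(self_e[k])):
            e = energy_so_far + self_e[k][r]
            e += sum(pair_e[(i, k)][partial[i]][r] for i in range(k))
            partial.append(r)
            recurse(partial, e)
            partial.pop()

    recurse([], 0.0)
    return best["choice"], best["energy"]


# Made-up toy instance: 3 positions with 2, 3 and 2 candidates.
self_e = [[1.0, 0.5], [0.2, 0.0, 0.9], [0.3, 0.4]]
pair_e = {
    (0, 1): [[0.0, 2.0, -1.0], [1.0, -0.5, 0.0]],
    (0, 2): [[0.5, -0.2], [0.0, 0.3]],
    (1, 2): [[-0.3, 0.1], [0.4, -0.6], [0.0, 0.0]],
}
print(branch_and_bound(self_e, pair_e))
```

The quality of the lower bound decides how much of the search tree is pruned. The naive bound above is cheap but loose, which is why exact methods for large design instances depend on tighter bounds such as those obtained from a relaxation.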
