Protein design is NP-hard.

Biologists working in the area of computational protein design have never doubted the seriousness of the algorithmic challenges that face them in attempting in silico sequence selection. It turns out that in the language of the computer science community, this discrete optimization problem is NP-hard. The purpose of this paper is to explain the context of this observation, to provide a simple illustrative proof and to discuss the implications for future progress on algorithms for computational protein design.

[1]  J R Desjarlais,et al.  Computer search algorithms in protein modification and design. , 1998, Current opinion in structural biology.

[2]  Solomon Eyal Shimony,et al.  Finding MAPs for Belief Networks is NP-Hard , 1994, Artif. Intell..

[3]  D. Benjamin Gordon,et al.  Radical performance enhancements for combinatorial optimization algorithms based on the dead‐end elimination theorem , 1998 .

[4]  J R Desjarlais,et al.  De novo design of the hydrophobic cores of proteins , 1995, Protein science : a publication of the Protein Society.

[5]  S. L. Mayo,et al.  Conformational splitting: A more powerful criterion for dead‐end elimination , 2000, J. Comput. Chem..

[6]  Daniel Kahneman,et al.  Probabilistic reasoning , 1993 .

[7]  Thomas Stützle,et al.  Local Search Algorithms for SAT: An Empirical Evaluation , 2000, Journal of Automated Reasoning.

[8]  M. Levitt,et al.  De novo protein design. I. In search of stability and specificity. , 1999, Journal of molecular biology.

[9]  Johan Desmet,et al.  The dead-end elimination theorem and its use in protein side-chain positioning , 1992, Nature.

[10]  Kenneth Steiglitz,et al.  Combinatorial Optimization: Algorithms and Complexity , 1981 .

[11]  Christopher A. Voigt,et al.  Trading accuracy for speed: A quantitative comparison of search algorithms in protein sequence design. , 2000, Journal of molecular biology.

[12]  D B Gordon,et al.  Branch-and-terminate: a combinatorial optimization algorithm for protein design. , 1999, Structure.

[13]  S. L. Mayo,et al.  De novo protein design: fully automated sequence selection. , 1997, Science.

[14]  S J Wodak,et al.  Automatic protein design with all atom force-fields by exact and heuristic optimization. , 2000, Journal of molecular biology.

[15]  A R Leach,et al.  Exploring the conformational space of protein side chains using dead‐end elimination and the A* algorithm , 1998, Proteins.

[16]  R. Goldstein Efficient rotamer elimination applied to protein side-chains and related spin glasses. , 1994, Biophysical journal.

[17]  John E. Hopcroft,et al.  Complexity of Computer Computations , 1974, IFIP Congress.

[18]  Roland L. Dunbrack,et al.  Backbone-dependent rotamer library for proteins. Application to side-chain prediction. , 1993, Journal of molecular biology.

[19]  Judea Pearl,et al.  Probabilistic reasoning in intelligent systems , 1988 .

[20]  Junichi Takagi,et al.  Computational design of an integrin I domain stabilized in the open high affinity conformation , 2000, Nature Structural Biology.

[21]  Teofilo F. Gonzalez,et al.  P-Complete Approximation Problems , 1976, J. ACM.

[22]  J. Ponder,et al.  Tertiary templates for proteins. Use of packing criteria in the enumeration of allowed sequences for different structural classes. , 1987, Journal of molecular biology.

[23]  S. A. Marshall,et al.  Energy functions for protein design. , 1999, Current opinion in structural biology.

[24]  M. Levitt,et al.  Conformation of amino acid side-chains in proteins. , 1978, Journal of molecular biology.

[25]  D. Benjamin Gordon,et al.  Exact rotamer optimization for protein design , 2003, J. Comput. Chem..

[26]  L L Looger,et al.  Generalized dead-end elimination algorithms make large-scale protein side-chain structure prediction tractable: implications for protein design and structural genomics. , 2001, Journal of molecular biology.

[27]  Yair Weiss,et al.  Correctness of Local Probability Propagation in Graphical Models with Loops , 2000, Neural Computation.

[28]  Sanjeev Arora,et al.  Polynomial time approximation schemes for Euclidean traveling salesman and other geometric problems , 1998, JACM.

[29]  Stephen L. Mayo,et al.  Design, structure and stability of a hyperthermophilic protein variant , 1998, Nature Structural Biology.

[30]  David S. Johnson,et al.  Computers and Intractability: A Guide to the Theory of NP-Completeness , 1978 .

[31]  S. L. Mayo,et al.  Protein design automation , 1996, Protein science : a publication of the Protein Society.