Generalized dead-end elimination algorithms make large-scale protein side-chain structure prediction tractable: implications for protein design and structural genomics.

The dead-end elimination (DEE) theorems are powerful tools for the combinatorial optimization of protein side-chain placement in protein design and homology modeling. In order to reach their full potential, the theorems must be extended to handle very hard problems. We present a suite of new algorithms within the DEE paradigm that significantly extend its range of convergence and reduce run time. As a demonstration, we show that a total protein design problem of 10^115 combinations, a hydrophobic core design problem of 10^244 combinations, and a side-chain placement problem of 10^1044 combinations are solved in less than two weeks, a day and a half, and an hour of CPU time, respectively. This extends the range of the method by approximately 53, 144 and 851 log-units, respectively, using modest computational resources. Small to average-sized protein domains can now be designed automatically, and side-chain placement calculations can be solved for nearly all sizes of proteins and protein complexes in the growing field of structural genomics.
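
For readers unfamiliar with the DEE paradigm, the sketch below illustrates the classical single-rotamer elimination test in its Goldstein form: rotamer r at position i can be discarded if some competitor t at the same position satisfies E(i_r) - E(i_t) + sum over j of min over s of [E(i_r, j_s) - E(i_t, j_s)] > 0. This is only a minimal illustration of the baseline criterion, not the new algorithms the abstract describes. The function names, the nested-list energy layout, and the random toy energies are all assumptions made for this example.

```python
import itertools
import random


def make_toy_problem(n_pos, n_rot, seed=0):
    """Build hypothetical random self and pair energies (illustration only)."""
    rng = random.Random(seed)
    self_e = [[rng.uniform(-1.0, 1.0) for _ in range(n_rot)] for _ in range(n_pos)]
    # pair_e[i][j][r][s] is the energy between rotamer r at i and rotamer s at j.
    pair_e = [[None] * n_pos for _ in range(n_pos)]
    for i in range(n_pos):
        for j in range(i + 1, n_pos):
            m = [[rng.uniform(-1.0, 1.0) for _ in range(n_rot)] for _ in range(n_rot)]
            pair_e[i][j] = m
            # Store the transpose so that lookups are symmetric.
            pair_e[j][i] = [[m[r][s] for r in range(n_rot)] for s in range(n_rot)]
    return self_e, pair_e


def total_energy(assign, self_e, pair_e):
    """Energy of one full assignment: self terms plus each pair counted once."""
    n = len(assign)
    e = sum(self_e[i][assign[i]] for i in range(n))
    for i in range(n):
        for j in range(i + 1, n):
            e += pair_e[i][j][assign[i]][assign[j]]
    return e


def goldstein_dee(self_e, pair_e):
    """Repeat the Goldstein elimination test until no rotamer can be removed."""
    n = len(self_e)
    alive = [set(range(len(self_e[i]))) for i in range(n)]
    changed = True
    while changed:
        changed = False
        for i in range(n):
            for r in sorted(alive[i]):
                for t in sorted(alive[i]):
                    if t == r:
                        continue
                    gap = self_e[i][r] - self_e[i][t]
                    for j in range(n):
                        if j == i:
                            continue
                        # Best case for r relative to t over surviving rotamers at j.
                        gap += min(pair_e[i][j][r][s] - pair_e[i][j][t][s]
                                   for s in alive[j])
                    if gap > 0:
                        # Even in its best case r is worse than t: r is a dead end.
                        alive[i].discard(r)
                        changed = True
                        break
    return alive


def brute_force_gmec(self_e, pair_e):
    """Exhaustive search for the global minimum; feasible only for tiny problems."""
    choices = [range(len(rots)) for rots in self_e]
    return min(itertools.product(*choices),
               key=lambda a: total_energy(a, self_e, pair_e))
```

On a toy problem small enough to enumerate, the rotamers of the brute-force global minimum should all survive elimination, because the strict inequality removes a rotamer only when swapping in its competitor lowers the energy of every conformation containing it. Exhaustive enumeration is hopeless at the problem sizes quoted above, which is why elimination criteria of this kind matter.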
