Improved Pruning algorithms and Divide-and-Conquer strategies for Dead-End Elimination, with application to protein design

MOTIVATION: Structure-based protein redesign can help engineer proteins with desired novel function. Improving computational efficiency while maintaining the accuracy of the design predictions has been a major goal for protein design algorithms. The combinatorial nature of protein design results both from allowing residue mutations and from incorporating protein side-chain flexibility. Under the assumption that a single conformation can model protein folding and binding, the goal of many algorithms is to identify the Global Minimum Energy Conformation (GMEC). A dominant theorem for identifying the GMEC is Dead-End Elimination (DEE). DEE-based algorithms have proven capable of eliminating the majority of candidate conformations while guaranteeing that only rotamers not belonging to the GMEC are pruned. However, when the protein design process incorporates rotameric energy minimization, DEE is no longer provably accurate. Hence, with energy minimization, the minimized-DEE (MinDEE) criterion must be used instead.

RESULTS: In this paper, we present provably accurate improvements to both the DEE and MinDEE criteria. We show that our novel enhancements yield speedups that exceed a factor of 1000 in the best case when applied to the redesign of three proteins: Gramicidin Synthetase A, plastocyanin, and protein G.

AVAILABILITY: Contact authors for source code.
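To make the pruning guarantee of DEE concrete, the following is a minimal sketch of the traditional rigid-rotamer DEE test as it is commonly stated in the literature: a rotamer r at position i is pruned when its best-case energy is still worse than the worst-case energy of some competing rotamer t at the same position. This is not the improved criteria or the MinDEE criterion presented in this paper. The function names (`pair`, `dee_prune`, `brute_force_gmec`), the energy-table layout, and the toy energies are all assumptions made for illustration.

```python
from itertools import product


def pair(pair_energy, i, r, j, s):
    """Symmetric pairwise energy between rotamer r at position i and s at j."""
    return pair_energy[(i, r, j, s)] if i < j else pair_energy[(j, s, i, r)]


def dee_prune(self_energy, pair_energy):
    """Return the set of (position, rotamer) pairs removed by the classic DEE test.

    self_energy[i][r]           : self energy of rotamer r at position i
    pair_energy[(i, r, j, s)]   : pairwise energy, stored once with i < j
    (Both layouts are hypothetical, chosen to keep the sketch short.)
    """
    n = len(self_energy)
    pruned = set()

    def alive(i):
        return [r for r in range(len(self_energy[i])) if (i, r) not in pruned]

    def bound(i, r, pick):
        # pick=min gives the best case for rotamer r, pick=max the worst case.
        return self_energy[i][r] + sum(
            pick(pair(pair_energy, i, r, j, s) for s in alive(j))
            for j in range(n) if j != i
        )

    changed = True
    while changed:  # repeat until a full pass prunes nothing
        changed = False
        for i in range(n):
            for r in alive(i):
                # Prune r if some surviving competitor t is always better.
                if any(bound(i, r, min) > bound(i, t, max)
                       for t in alive(i) if t != r):
                    pruned.add((i, r))
                    changed = True
    return pruned


def brute_force_gmec(self_energy, pair_energy):
    """Exhaustively find the lowest-energy rotamer assignment (toy sizes only)."""
    n = len(self_energy)

    def total(conf):
        e = sum(self_energy[i][conf[i]] for i in range(n))
        e += sum(pair(pair_energy, i, conf[i], j, conf[j])
                 for i in range(n) for j in range(i + 1, n))
        return e

    return min(product(*(range(len(s)) for s in self_energy)), key=total)


# Toy system: two positions with two rotamers each (made-up energies).
self_energy = [[0.0, 5.0], [0.0, 0.0]]
pair_energy = {(0, 0, 1, 0): 0.0, (0, 0, 1, 1): 1.0,
               (0, 1, 1, 0): 0.0, (0, 1, 1, 1): 1.0}

pruned = dee_prune(self_energy, pair_energy)
gmec = brute_force_gmec(self_energy, pair_energy)
```

On this toy system the pruned rotamers can be compared against the exhaustively computed minimum to check that no rotamer of the GMEC was removed. The test assumes a fixed energy for each rotamer pair; the abstract's point is that once rotamers are allowed to minimize, this comparison is no longer guaranteed to be safe, which is why MinDEE is needed.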
