LUTE (Local Unpruned Tuple Expansion): Accurate Continuously Flexible Protein Design with General Energy Functions and Rigid-rotamer-like Efficiency

Most protein design algorithms search over discrete conformations and use an energy function that is residue-pairwise, i.e., a sum of terms that each depend on the sequence and conformation of at most two residues. Although modeling continuous flexibility and non-residue-pairwise energies significantly increases the accuracy of protein design, previous methods for modeling these phenomena add a significant asymptotic cost to design calculations. We now remove this cost by modeling continuous flexibility and non-residue-pairwise energies in a form suitable for direct input to highly efficient, discrete combinatorial optimization algorithms such as DEE/A* or Branch-Width Minimization. Our novel algorithm performs a local unpruned tuple expansion (LUTE), which can efficiently represent both continuous flexibility and general, possibly non-pairwise energy functions to an arbitrary level of accuracy using a discrete energy matrix. We show, using 47 design calculation test cases, that LUTE provides a dramatic speedup in both single-state and multistate continuously flexible designs.
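
To make the notion of a discrete, residue-pairwise energy matrix concrete, the following minimal Python sketch evaluates a conformation's energy as a sum of one-body and two-body terms and finds the lowest-energy rotamer assignment by exhaustive enumeration. It is an illustration only, not the LUTE algorithm: the function names, the toy matrix values, and the use of brute-force search (standing in for DEE/A* or similar discrete optimizers) are all assumptions made for this example.

```python
from itertools import product


def conformation_energy(conf, one_body, pairwise):
    """Energy of a rotamer assignment under a residue-pairwise model.

    conf[i] is the rotamer index chosen at residue i. one_body[i][r] is the
    one-body term; pairwise[(j, i)][rj][ri] (with j < i) is the two-body term.
    All names and data layouts here are hypothetical.
    """
    energy = sum(one_body[i][r] for i, r in enumerate(conf))
    for i in range(len(conf)):
        for j in range(i):
            energy += pairwise[(j, i)][conf[j]][conf[i]]
    return energy


def brute_force_minimum(one_body, pairwise):
    """Enumerate every rotamer assignment and return the lowest-energy one.

    Exhaustive search is exponential in the number of residues; it is used
    here only because the toy problem is tiny.
    """
    rotamer_ranges = [range(len(terms)) for terms in one_body]
    return min(
        product(*rotamer_ranges),
        key=lambda conf: conformation_energy(conf, one_body, pairwise),
    )


# Toy energy matrix: 3 residues, 2 rotamers each (values are made up).
ONE_BODY = [[0.0, 1.0], [0.5, 0.0], [0.2, 0.3]]
PAIRWISE = {
    (0, 1): [[0.0, 2.0], [-1.0, 0.0]],
    (0, 2): [[0.0, 0.0], [0.0, -0.5]],
    (1, 2): [[1.0, 0.0], [0.0, 0.4]],
}

if __name__ == "__main__":
    best = brute_force_minimum(ONE_BODY, PAIRWISE)
    print(best, conformation_energy(best, ONE_BODY, PAIRWISE))
```

Because the energy is stored entirely in a discrete matrix, any optimizer that accepts such a matrix can be applied unchanged; the abstract's claim is that LUTE produces a matrix of this general kind (extended to tuples where needed) that also accounts for continuous flexibility and non-pairwise energies.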
