Energy functions in de novo protein design: current challenges and future prospects.

In the past decade, a concerted effort to capture specific tertiary packing interactions has yielded many de novo designed proteins with specific three-dimensional structures validated by nuclear magnetic resonance and/or X-ray crystallography. However, the success rate of computational design remains low. In this review, we provide an overview of experimentally validated de novo designed proteins and compare four available programs (RosettaDesign, EGAD, Liang-Grishin, and RosettaDesign-SR) by computationally assessing the sequences they design. The assessment covers the recovery of native sequences, the sizes of hydrophobic patches and the total solvent-accessible surface area, and predicted structural properties such as intrinsic disorder, secondary structure, and three-dimensional structure. Together with a recent community-wide experiment assessing scoring functions for interface design, this assessment suggests that the next-generation protein-design scoring function will come from the right balance of complementary interaction terms. Such a balance may be found when more negative experimental data become available for inclusion in a training set.
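To make the first assessment criterion concrete, the sketch below computes native sequence recovery as the fraction of positions at which a designed sequence carries the same residue as the native one. This is a minimal illustration under stated assumptions, not the procedure used in the review: the function name `sequence_recovery` and the example sequences are hypothetical, and the two sequences are assumed to be position-aligned and of equal length, as in fixed-backbone design.

```python
def sequence_recovery(native: str, designed: str) -> float:
    """Fraction of positions where the designed residue matches the native one.

    Assumes both sequences are position-aligned and of equal length
    (an assumption of this sketch, as in fixed-backbone design).
    """
    if len(native) != len(designed):
        raise ValueError("sequences must have the same length")
    if not native:
        raise ValueError("sequences must be non-empty")
    # Compare residue by residue, ignoring letter case.
    matches = sum(n == d for n, d in zip(native.upper(), designed.upper()))
    return matches / len(native)


# Hypothetical 10-residue example: two substitutions out of ten positions.
native = "MKTAYIAKQR"
designed = "MKSAYLAKQR"
print(f"{sequence_recovery(native, designed):.2f}")  # prints 0.80
```

Recovery computed this way treats every position equally; restricting it to buried or surface positions would only require filtering the position pairs before counting.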
