Computational protein design with explicit consideration of surface hydrophobic patches

De novo protein design requires the identification of amino-acid sequences that favor the target folded conformation and are soluble in water. One strategy for promoting solubility is to disallow hydrophobic residues on the protein surface during design. However, naturally occurring proteins often have hydrophobic amino acids on their surface, which either contribute to protein stability through the partial burial of hydrophobic surface area or play a key role in the formation of protein–protein interactions. A less restrictive approach to surface design, used by the modeling program Rosetta, is to parameterize the energy function so that the number of hydrophobic amino acids designed on the protein surface is similar to that observed in naturally occurring monomeric proteins. Previous studies with Rosetta have shown that this approach limits surface hydrophobics to the naturally occurring frequency (~28%), but does not prevent the formation of hydrophobic patches that are considerably larger than those observed in naturally occurring proteins. Here, we describe a new score term that explicitly detects and penalizes the formation of hydrophobic patches during computational protein design. With the new term, we are able to design protein surfaces that include hydrophobic amino acids at naturally occurring frequencies but do not have large hydrophobic patches. By adjusting the strength of the new score term, the emphasis of surface redesigns can be switched between maintaining solubility and maximizing folding free energy. Proteins 2011. © 2012 Wiley Periodicals, Inc.
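
The idea of a score term that detects and penalizes hydrophobic patches can be illustrated with a minimal sketch. This is not the Rosetta implementation. It assumes, purely for illustration, that surface residues are represented by a single coordinate each, that two hydrophobic residues belong to the same patch when they lie within a fixed distance cutoff, and that the penalty grows quadratically with the number of residues beyond a tolerated patch size. The names `SurfaceResidue`, `find_patches`, and `patch_penalty`, the hydrophobic residue set, and all numeric defaults are hypothetical.

```python
from dataclasses import dataclass
from math import dist

# Assumption: a simple one-letter-code set of residues treated as hydrophobic.
HYDROPHOBIC = frozenset("AFILMVWY")


@dataclass(frozen=True)
class SurfaceResidue:
    aa: str      # one-letter amino-acid code
    xyz: tuple   # (x, y, z) of a representative atom, in angstroms


def find_patches(residues, cutoff=7.0):
    """Group hydrophobic surface residues into patches using union-find.

    Two hydrophobic residues are linked when their coordinates are within
    `cutoff` angstroms (an illustrative criterion, not the published one).
    Returns lists of residue indices, largest patch first.
    """
    idx = [i for i, r in enumerate(residues) if r.aa in HYDROPHOBIC]
    parent = {i: i for i in idx}

    def find(i):
        # Follow parent links to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for pos, a in enumerate(idx):
        for b in idx[pos + 1:]:
            if dist(residues[a].xyz, residues[b].xyz) <= cutoff:
                parent[find(a)] = find(b)

    patches = {}
    for i in idx:
        patches.setdefault(find(i), []).append(i)
    return sorted(patches.values(), key=len, reverse=True)


def patch_penalty(residues, weight=1.0, max_free_size=3, cutoff=7.0):
    """Penalty that is zero for small patches and grows for large ones."""
    penalty = 0.0
    for patch in find_patches(residues, cutoff):
        excess = len(patch) - max_free_size
        if excess > 0:
            penalty += weight * excess ** 2
    return penalty


if __name__ == "__main__":
    surface = [
        SurfaceResidue("L", (0.0, 0.0, 0.0)),
        SurfaceResidue("V", (5.0, 0.0, 0.0)),
        SurfaceResidue("F", (10.0, 0.0, 0.0)),
        SurfaceResidue("I", (15.0, 0.0, 0.0)),
        SurfaceResidue("K", (2.0, 0.0, 0.0)),    # polar, ignored
        SurfaceResidue("A", (100.0, 0.0, 0.0)),  # isolated hydrophobic
    ]
    print(find_patches(surface))
    print(patch_penalty(surface, weight=2.0))
```

In this sketch the `weight` argument plays the role of the adjustable term strength described above: a larger value pushes a design toward fewer large patches, while a value of zero leaves the rest of the energy function unchanged.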
