Evaluating and optimizing computational protein design force fields using fixed composition-based negative design

An accurate force field is essential to computational protein design and protein fold prediction studies. Proper force field tuning is problematic, however, due in part to the incomplete modeling of the unfolded state. Here, we evaluate and optimize a protein design force field by constraining the amino acid composition of the designed sequences to that of a well-behaved model protein. According to the random energy model, unfolded-state energies depend only on amino acid composition and not on the specific arrangement of amino acids. Therefore, for sequences of identical composition, energy discrepancies between computational predictions and experimental results can be directly attributed to flaws in the force field's ability to properly account for folded-state sequence energies. This aspect of fixed-composition design allows force field optimization to focus solely on the interactions in the folded state. Several rounds of fixed-composition optimization of the 56-residue β1 domain of protein G yielded force field parameters with significantly greater predictive power: optimized sequences exhibited higher wild-type sequence identity in critical regions of the structure, and the wild-type sequence showed an improved Z-score. Experimental studies revealed a designed 24-fold mutant to be stably folded, with a melting temperature similar to that of the wild-type protein. Sequence designs using engrailed homeodomain as a scaffold produced similar results, suggesting that the tuned force field parameters were not specific to protein G.
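
The fixed-composition argument can be illustrated with a minimal sketch. Everything named here is an assumption for illustration and not the force field or search algorithm of the paper: `toy_folded_energy` is a hypothetical burial-based scoring function, the decoys are plain random shuffles of the input sequence, and the Z-score is taken in the common form (native energy minus decoy mean, divided by decoy standard deviation), which the abstract does not define. Because every decoy is a permutation of the same residues, the composition-dependent unfolded-state term is the same for all of them, so only folded-state energies need to be compared.

```python
import random
import statistics
from collections import Counter

# Hypothetical residue classification, used only by the toy energy below.
HYDROPHOBIC = set("AVILMFWY")


def toy_folded_energy(seq, burial):
    """Toy folded-state energy (an assumption, not a real force field):
    hydrophobic residues score -1.0 when buried and +0.5 when exposed."""
    energy = 0.0
    for aa, buried in zip(seq, burial):
        if aa in HYDROPHOBIC:
            energy += -1.0 if buried else 0.5
    return energy


def shuffled_decoy(seq, rng):
    """Return a random permutation of seq, so composition is unchanged."""
    residues = list(seq)
    rng.shuffle(residues)
    return "".join(residues)


def fixed_composition_z_score(seq, burial, n_decoys=1000, seed=0):
    """Z-score of seq against decoys that share its amino acid composition."""
    rng = random.Random(seed)
    decoy_energies = [
        toy_folded_energy(shuffled_decoy(seq, rng), burial)
        for _ in range(n_decoys)
    ]
    mean = statistics.fmean(decoy_energies)
    sd = statistics.pstdev(decoy_energies)
    if sd == 0:
        raise ValueError("all decoys have the same energy; Z-score undefined")
    return (toy_folded_energy(seq, burial) - mean) / sd


if __name__ == "__main__":
    # Made-up 8-residue example: first four positions buried, last four exposed.
    sequence = "AVILKKEE"
    burial = [True] * 4 + [False] * 4
    print(fixed_composition_z_score(sequence, burial))
```

A more negative Z-score means the sequence scores better than other arrangements of the same residues on the given structure. In this reading, tuning a force field amounts to adjusting its parameters so that experimentally well-behaved sequences rank favorably among their fixed-composition alternatives.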
