Coarse-graining protein energetics in sequence variables.

We show that cluster expansions (CE), previously used to model solid-state materials with binary or ternary configurational disorder, can be extended to the protein design problem. We present a generalized CE framework in which properties such as energy can be unambiguously expanded in amino-acid sequence space. The CE coarse-grains over nonsequence degrees of freedom (e.g., side-chain conformations) and thereby simplifies the problem of designing proteins, or of predicting the compatibility of a sequence with a given structure, by many orders of magnitude. The CE is physically transparent, and it can be determined through linear regression on the energies of training sequences. We show, as an example, that good prediction accuracy is obtained with interactions up to pairwise order for a coiled-coil backbone, and that triplet interactions are important in the energetics of a more globular zinc-finger backbone.
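
To make the regression step concrete, the following is a minimal sketch of fitting a CE to sequence energies by least squares. It is an illustration under stated assumptions, not the framework or basis used in this work: the indicator-function basis with one reference amino-acid type per site, the toy energy function `toy_energy`, and the helper names `ce_features`, `fit_ce`, and `predict_ce` are all hypothetical, and the "energies" are random synthetic numbers rather than results from any protein force field.

```python
import itertools
import numpy as np


def ce_features(seq, n_types, max_order=2):
    """Indicator cluster functions up to `max_order` sites (assumed basis).

    The last amino-acid type at each site serves as the reference and gets
    no indicator, which keeps the columns linearly independent.
    """
    feats = [1.0]  # constant (empty-cluster) term
    for order in range(1, max_order + 1):
        for sites in itertools.combinations(range(len(seq)), order):
            for types in itertools.product(range(n_types - 1), repeat=order):
                hit = all(seq[s] == t for s, t in zip(sites, types))
                feats.append(1.0 if hit else 0.0)
    return np.array(feats)


def toy_energy(seq, h, J):
    """Hypothetical stand-in for a structure-based energy: point plus pair terms."""
    e = sum(h[i, a] for i, a in enumerate(seq))
    for i, j in itertools.combinations(range(len(seq)), 2):
        e += J[i, j, seq[i], seq[j]]
    return e


def fit_ce(seqs, energies, n_types, max_order=2):
    """Least-squares fit of CE coefficients to training energies."""
    X = np.array([ce_features(s, n_types, max_order) for s in seqs])
    coef, *_ = np.linalg.lstsq(X, energies, rcond=None)
    return coef


def predict_ce(seqs, coef, n_types, max_order=2):
    """Evaluate the fitted CE on a list of sequences."""
    X = np.array([ce_features(s, n_types, max_order) for s in seqs])
    return X @ coef


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n_sites, n_types = 4, 3  # tiny toy system, not a real protein
    h = rng.normal(size=(n_sites, n_types))
    J = rng.normal(size=(n_sites, n_sites, n_types, n_types))
    seqs = list(itertools.product(range(n_types), repeat=n_sites))
    E = np.array([toy_energy(s, h, J) for s in seqs])

    # Compare a point-only expansion with one that includes pair clusters.
    for order in (1, 2):
        coef = fit_ce(seqs, E, n_types, order)
        rms = np.sqrt(np.mean((predict_ce(seqs, coef, n_types, order) - E) ** 2))
        print(f"max cluster order {order}: RMS error {rms:.3e}")
```

Because the toy energy contains only point and pair terms, the pairwise expansion can reproduce it, while the point-only expansion cannot. Adding triplet clusters (`max_order=3`) would be the analogous step for a system in which three-body effects matter.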
