Enumerating Designing Sequences in the HP Model

The hydrophobic/polar (HP) model on the square lattice has been widely used to investigate basic aspects of protein folding. In studies that enumerated all designing sequences (sequences with a unique ground state) without restricting the number of contacts, the chain length N has been limited to 18–20 because the numbers of conformations and sequences both grow exponentially with N. We show how a few optimizations raise this limit by about five monomers. Based on these calculations, we study the statistical distribution of hydrophobicity along designing sequences. We find that the average number of hydrophobic and polar clumps along the chain is larger for designing sequences than for random ones, in agreement with earlier findings for N ≤ 18 and with results for real enzymes. We also show that this deviation from randomness disappears if the calculations are restricted to maximally compact structures.
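To make the definition of a designing sequence concrete, the following is a minimal brute-force sketch for very short chains. It is not the optimized enumeration referred to above. It assumes the common HP energy convention of -1 per contact between two H monomers that are lattice neighbours but not chain neighbours, and it counts conformations once per rotation/reflection class. The function names (`conformations`, `designing_sequences`, `count_clumps`) are hypothetical, and `count_clumps` simply counts runs of identical letters, which may differ from the clump statistic used in the study.

```python
from itertools import product

STEPS = ((1, 0), (0, 1), (-1, 0), (0, -1))


def contacts(path):
    """Return pairs (i, j) of monomers that are lattice neighbours but not chain neighbours."""
    index = {pos: i for i, pos in enumerate(path)}
    pairs = []
    for i, (x, y) in enumerate(path):
        # Looking only right and up visits every lattice bond exactly once.
        for nb in ((x + 1, y), (x, y + 1)):
            j = index.get(nb)
            if j is not None and abs(i - j) > 1:
                pairs.append((min(i, j), max(i, j)))
    return pairs


def conformations(n):
    """Yield the contact list of every n-monomer self-avoiding walk (n >= 2),
    one representative per rotation/reflection class."""
    def extend(path, occupied, turned):
        if len(path) == n:
            yield contacts(path)
            return
        x, y = path[-1]
        for dx, dy in STEPS:
            if not turned and dy < 0:
                continue  # the first step off the x-axis must go up (fixes reflection)
            nxt = (x + dx, y + dy)
            if nxt in occupied:
                continue
            path.append(nxt)
            occupied.add(nxt)
            yield from extend(path, occupied, turned or dy != 0)
            path.pop()
            occupied.remove(nxt)

    # The first bond always points along +x (fixes rotation).
    start = [(0, 0), (1, 0)]
    yield from extend(start, set(start), False)


def designing_sequences(n):
    """Return all HP sequences of length n whose lowest energy is reached in exactly one conformation."""
    confs = list(conformations(n))
    found = []
    for seq in product("HP", repeat=n):
        best, degeneracy = None, 0
        for pairs in confs:
            # Assumed energy: -1 for each H-H contact.
            energy = -sum(seq[i] == "H" and seq[j] == "H" for i, j in pairs)
            if best is None or energy < best:
                best, degeneracy = energy, 1
            elif energy == best:
                degeneracy += 1
        if degeneracy == 1:
            found.append("".join(seq))
    return found


def count_clumps(seq):
    """Count maximal runs of identical letters (an assumed, simplified clump definition)."""
    return 1 + sum(a != b for a, b in zip(seq, seq[1:]))
```

For N = 4 the only possible contact is between the two chain ends, so `designing_sequences(4)` returns the four sequences with H at both ends. The cost of this sketch grows as the product of the number of sequences and the number of conformations, which is why it is only practical for chains far shorter than the lengths discussed above.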
