Protein structural codes and nucleation sites for protein folding

One of the long-standing controversies in protein folding is Levinthal's paradox. We have recently proposed a new nucleation hypothesis and shown that the nucleation residues are the most conserved sequences in proteins. To avoid the complicating effect of tertiary interactions, we limit our search for structural codes to the nucleation residues. Starting from the hypotheses of secondary structure nucleation and conservation of residues important for folding, we have analysed 762 folds classified as unique by SCOP. Segments of 17 residues centred on the top 20% most conserved amino acids are analysed, resulting in approximately 100 clusters for each of the main secondary structure classes: helix, sheet and coil. Helical clusters have the longest correlation range and coil clusters the shortest (four residues). A strong, specific sequence-structure correlation is observed for coil but not for helix or sheet, suggesting a mapping relationship between sequence and structure for coil. We propose that the central sequences in these clusters form 'structural codes', a useful basis set for identifying nucleation sites, protein fragments that are stable in isolation, and secondary structural patterns in proteins (particularly turns and loops).
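The segment-selection step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `extract_segments`, the per-residue conservation scores supplied as a plain list, and the choice to skip conserved positions too close to a chain terminus to fit a full window are all assumptions made for illustration. Only the window length (17) and the top fraction (20%) come from the text.

```python
from typing import List, Sequence, Tuple


def extract_segments(
    sequence: str,
    conservation: Sequence[float],
    window: int = 17,          # segment length stated in the abstract
    top_fraction: float = 0.20,  # "top 20% conserved" stated in the abstract
) -> List[Tuple[int, str]]:
    """Return (centre_index, segment) pairs for the most conserved residues.

    Hypothetical helper: how conservation is scored and how chain termini
    are treated are assumptions, not details taken from the paper.
    """
    if len(sequence) != len(conservation):
        raise ValueError("sequence and conservation must have equal length")
    if window % 2 == 0:
        raise ValueError("window must be odd so the segment has a centre")

    n = len(sequence)
    half = window // 2
    n_top = max(1, round(n * top_fraction))

    # Rank positions by conservation score, highest first, and keep the top ones.
    ranked = sorted(range(n), key=lambda i: conservation[i], reverse=True)
    top_positions = sorted(ranked[:n_top])

    segments = []
    for i in top_positions:
        # Assumption: drop positions whose full window would run off the chain.
        if i - half < 0 or i + half >= n:
            continue
        segments.append((i, sequence[i - half : i + half + 1]))
    return segments
```

In a pipeline like the one the abstract outlines, the resulting segments would then be grouped by the secondary structure class of the central residue and clustered; that clustering step is not sketched here because the abstract does not specify it.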