Structures, basins, and energies: A deconstruction of the Protein Coil Library

Globular proteins adopt complex folds composed of organized assemblies of α‐helix and β‐sheet, together with irregular regions that interconnect these scaffold elements. Here, we seek to parse the irregular regions into their structural constituents and to rationalize their formative energetics. Toward this end, we dissected the Protein Coil Library, a structural database of protein segments that are neither α‐helix nor β‐strand, extracted from high‐resolution protein structures. The backbone dihedral angles of residues from coil library segments are distributed indiscriminately across the φ,ψ map, but when the map is contoured, seven distinct basins emerge clearly. The structures and energetics associated with the two least‐studied basins are the primary focus of this article. Specifically, the structural motifs associated with these basins were characterized in detail and then assessed in simple simulations designed to capture their energetic determinants. We find that conformational constraints imposed by excluded volume and hydrogen bonding are sufficient to reproduce the observed φ,ψ distributions of these motifs; no additional energy terms are required. These three motifs, in conjunction with α‐helices, strands of β‐sheet, canonical β‐turns, and polyproline II conformers, comprise ∼90% of all protein structure.
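As a rough illustration of how basins can be located on a φ,ψ map, the sketch below bins backbone dihedral angles on a periodic grid and reports bins that are more populated than all eight of their neighbors. It is not the procedure used in the article: the 10° bin width, the `min_count` threshold, and the function names `bin_index`, `phi_psi_histogram`, and `basin_centers` are all assumptions made for this example, and the sample angles are synthetic.

```python
def bin_index(angle_deg, bin_deg):
    """Map an angle in degrees to a bin index on a periodic [-180, 180) axis."""
    n_bins = 360 // bin_deg
    wrapped = (angle_deg + 180.0) % 360.0  # shift to [0, 360)
    # The trailing modulo guards against floating-point round-up to 360.0.
    return int(wrapped // bin_deg) % n_bins


def phi_psi_histogram(angles, bin_deg=10):
    """Count (phi, psi) pairs per grid cell; counts[i][j] is indexed by phi bin i, psi bin j."""
    n_bins = 360 // bin_deg
    counts = [[0] * n_bins for _ in range(n_bins)]
    for phi, psi in angles:
        counts[bin_index(phi, bin_deg)][bin_index(psi, bin_deg)] += 1
    return counts


def basin_centers(counts, bin_deg=10, min_count=1):
    """Return (phi, psi) centers of bins that exceed all 8 neighbors, with wrap-around."""
    n_bins = len(counts)
    centers = []
    for i in range(n_bins):
        for j in range(n_bins):
            c = counts[i][j]
            if c < min_count:
                continue
            neighbors = [
                counts[(i + di) % n_bins][(j + dj) % n_bins]
                for di in (-1, 0, 1)
                for dj in (-1, 0, 1)
                if (di, dj) != (0, 0)
            ]
            if all(c > v for v in neighbors):
                centers.append((-180 + (i + 0.5) * bin_deg,
                                -180 + (j + 0.5) * bin_deg))
    return centers


if __name__ == "__main__":
    # Synthetic angles only; real input would come from coil-library residues.
    sample = [(-65.0, -35.0)] * 5 + [(-55.0, -35.0)] * 2 + [(-75.0, 145.0)] * 4
    print(basin_centers(phi_psi_histogram(sample), min_count=3))
```

Because φ and ψ are periodic, the neighbor lookup wraps at ±180°, so a basin that straddles the edge of the map is still treated as one peak. On real data the raw counts would usually be smoothed before peak picking, since single-bin noise can produce spurious local maxima.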
