A Protein Folding Degree Measure and Its Dependence on Crystal Packing, Protein Size, Secondary Structure, and Domain Structural Class

Comparing two or more protein structures with respect to their degree of folding is common practice in structural biology, even though no scale for the degree of folding exists. Here we introduce a formal, quantitative definition of the folding degree, which makes it possible to order protein chains by their degree of folding. We then study the folding degree of a data set of 152 representative nonhomologous proteins. We demonstrate that the variation in the folding degree seen for this data set is not due to crystallization artifacts or experimental conditions, such as resolution, refinement protocol, pH, or temperature. A good linear relationship is observed between the folding degree and the percentages of secondary structures in the protein. The folding degree can also account for the small structural changes produced by crystal packing and temperature. In addition, it makes it possible to automate the classification of proteins into their respective structural domain classes, namely mainly-alpha, mainly-beta, and alpha-beta.
