Parser for protein folding units

General patterns of protein structural organization have emerged from studies of hundreds of structures elucidated by X‐ray crystallography and nuclear magnetic resonance. Structural units are commonly identified by visual inspection of molecular models using qualitative criteria. Here, we propose an algorithm for identification of structural units by objective, quantitative criteria based on atomic interactions. The underlying physical concept is maximal interaction within each unit and minimal interaction between units (domains). In a simple harmonic approximation, interdomain dynamics is determined by the strength of the interface and the distribution of masses. The most likely domain decomposition involves units with the most correlated motion, or largest interdomain fluctuation time. The decomposition of a convoluted 3‐D structure is complicated by the possibility that the chain can cross over several times between units. Grouping the residues by solving an eigenvalue problem for the contact matrix reduces the problem to a one‐dimensional search for all reasonable trial bisections. Recursive bisection yields a tree of putative folding units. Simple physical criteria are used to identify units that could exist by themselves. The units so defined closely correspond to crystallographers' notion of structural domains. The results are useful for the analysis of folding principles, for modular protein design and for protein engineering. © 1994 Wiley‐Liss, Inc.
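
The pipeline described above (eigenvector ordering of the contact matrix, a one-dimensional search over trial bisections, then recursion) can be illustrated with a minimal sketch. It is not the authors' implementation, and several choices are assumptions made only for illustration: the eigenvalue problem is solved by a lazy form of reciprocal averaging, interface strength is taken as the summed contact weight between the two parts, every residue has unit mass, and the fluctuation time of a trial bisection is scored as sqrt(reduced mass / interface strength), the period of a two-body harmonic oscillator up to a constant factor. The function names and the `min_tau` and `min_size` cutoffs are hypothetical stand-ins for the physical criteria, which the abstract does not specify.

```python
import math
import random


def ordering_scores(contacts, residues, n_iter=200, seed=0):
    """One score per residue from lazy reciprocal averaging (assumed solver)."""
    rng = random.Random(seed)
    degree = {i: sum(contacts[i][j] for j in residues) for i in residues}
    total = sum(degree.values())
    x = {i: rng.random() for i in residues}
    if total == 0:
        return x  # no contacts at all: ordering is arbitrary
    for _ in range(n_iter):
        new_x = {}
        for i in residues:
            if degree[i] == 0:
                new_x[i] = x[i]  # isolated residue keeps its score
                continue
            avg = sum(contacts[i][j] * x[j] for j in residues) / degree[i]
            new_x[i] = 0.5 * x[i] + 0.5 * avg  # lazy step avoids oscillation
        # Remove the trivial constant solution, then rescale.
        mean = sum(degree[i] * new_x[i] for i in residues) / total
        new_x = {i: v - mean for i, v in new_x.items()}
        scale = max(abs(v) for v in new_x.values())
        if scale == 0:
            break
        x = {i: v / scale for i, v in new_x.items()}
    return x


def best_bisection(contacts, residues, min_size=2):
    """Scan all cuts along the eigenvector ordering; return (a, b, tau) or None."""
    scores = ordering_scores(contacts, residues)
    order = sorted(residues, key=lambda i: scores[i])
    n = len(order)
    best = None
    for k in range(min_size, n - min_size + 1):
        a, b = order[:k], order[k:]
        interface = sum(contacts[i][j] for i in a for j in b)
        reduced_mass = len(a) * len(b) / n  # unit mass per residue (assumption)
        tau = math.inf if interface == 0 else math.sqrt(reduced_mass / interface)
        if best is None or tau > best[2]:
            best = (sorted(a), sorted(b), tau)
    return best


def parse_units(contacts, residues, min_tau=0.8, min_size=2):
    """Recursive bisection; returns a nested tuple with residue lists as leaves."""
    split = best_bisection(contacts, residues, min_size)
    if split is None or split[2] < min_tau:  # hypothetical stopping criterion
        return sorted(residues)
    a, b, _ = split
    return (parse_units(contacts, a, min_tau, min_size),
            parse_units(contacts, b, min_tau, min_size))


# Toy example: two compact units, with the chain crossing between them 3 times.
domain_a, domain_b = [0, 1, 2, 6, 7, 8], [3, 4, 5, 9, 10, 11]
n = 12
contacts = [[0.0] * n for _ in range(n)]
for dom in (domain_a, domain_b):
    for i in dom:
        for j in dom:
            if i != j:
                contacts[i][j] = 1.0
for i in range(n - 1):  # sequence neighbours are always in contact
    contacts[i][i + 1] = contacts[i + 1][i] = 1.0

tree = parse_units(contacts, list(range(n)))
print(tree)
```

In this toy case the ordering groups residues by unit even though each unit is discontinuous in sequence, which is the situation the abstract describes as the chain crossing over several times between units. A real application would need a contact matrix derived from atomic coordinates and stopping criteria with a physical justification.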
