Fractal Analysis of Measure Representation of Large Proteins Based on the Detailed HP Model

The notion of a measure representation of protein sequences is introduced based on the detailed HP model. Multifractal analysis and detrended fluctuation analysis are then performed on the measure representations of a large number of long protein sequences. The measure representations, together with the values of the Dq spectra and the related Cq curves, indicate that these protein sequences are not completely random. The values of the exponent from the detrended fluctuation analysis show that the K-strings, in the order in which they appear in the measure representation, exhibit strong long-range correlation. For substrings of length K=5, the Dq spectra of all proteins studied are multifractal-like and sufficiently smooth for the Cq curves to be meaningful. The Cq curves of all proteins resemble a classical phase transition at a critical point. An iterated function system (IFS) model is found to simulate the measure representation of protein sequences very well. The estimated values of the parameters in the IFS model suggest that the non-polar residues and the uncharged polar residues play a more important role than the other kinds of residues in the protein folding process.
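As a rough illustration of the idea, the following Python sketch builds a measure representation from the frequencies of K-strings and estimates Dq from it. It rests on several assumptions not spelled out in the abstract: the four-class grouping of amino acids in `CLASSES`, the base-4 ordering of K-strings on the unit interval, the skipping of unrecognised residue letters, and a single-scale estimate of Dq at box size 4^(-K) rather than a regression over several scales. The function names are hypothetical.

```python
from collections import Counter
import math

# Assumed four-class grouping of the 20 amino acids (one-letter codes).
# The abstract names the model but does not list the classes.
CLASSES = {
    0: "AILMFPWV",  # non-polar
    1: "DE",        # negative polar
    2: "NCQGSTY",   # uncharged polar
    3: "RHK",       # positive polar
}
RESIDUE_TO_DIGIT = {aa: d for d, group in CLASSES.items() for aa in group}


def measure_representation(seq, k):
    """Return the relative frequency of each of the 4**k possible K-strings.

    Each K-string is read as a base-4 number, which fixes its position
    among the 4**k equal subintervals of [0, 1).
    Letters outside the 20 standard residues are skipped (an assumption).
    """
    digits = [RESIDUE_TO_DIGIT[aa] for aa in seq if aa in RESIDUE_TO_DIGIT]
    counts = Counter()
    for i in range(len(digits) - k + 1):
        index = 0
        for d in digits[i:i + k]:
            index = index * 4 + d
        counts[index] += 1
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence is shorter than k after filtering")
    return [counts[i] / total for i in range(4 ** k)]


def generalized_dimension(measure, k, q):
    """Single-scale estimate of Dq at box size eps = 4**(-k)."""
    log_eps = -k * math.log(4)
    positive = [p for p in measure if p > 0]
    if q == 1:
        # Information dimension: limit of the general formula as q -> 1.
        return sum(p * math.log(p) for p in positive) / log_eps
    return math.log(sum(p ** q for p in positive)) / ((q - 1) * log_eps)
```

For a sequence whose K-strings are all equally frequent, the estimate is 1 for every q; a Dq that varies with q is the multifractal-like behaviour the abstract refers to.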