On the syntactic structure of protein sequences and the concept of grammar complexity

It is shown that the concepts of grammar complexity and syntactic structure provide a useful mathematical framework for the investigation of some current problems in protein structure. Grammar complexity gives a measure of the degree of aperiodicity of a sequence and also an optimization criterion for evaluating amino acid categorizations. Three systems of amino acid categorization are compared in relation to their value for describing molecular architecture.
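To make the idea concrete, the sketch below estimates the size of a context-free grammar that generates a single sequence, using greedy pair substitution. This is an illustrative stand-in, not necessarily the procedure used in the paper. The names `replace_pair`, `grammar_size` and `recode`, and the two-class hydrophobic/polar grouping, are assumptions introduced here for illustration only. A periodic sequence compresses into a few rules and gets a small value; an aperiodic one does not.

```python
from collections import Counter

# Hypothetical two-class categorization (hydrophobic vs. other),
# chosen for illustration only; it is not taken from the paper.
HYDROPHOBIC = set("AVLIMFWC")


def recode(seq):
    """Map each residue to 'h' (hydrophobic) or 'p' (other)."""
    return "".join("h" if aa in HYDROPHOBIC else "p" for aa in seq)


def replace_pair(symbols, pair, new_sym):
    """Replace non-overlapping occurrences of `pair`, scanning left to right.

    Returns the rewritten symbol list and the number of replacements made.
    """
    out, i, hits = [], 0, 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(new_sym)
            i += 2
            hits += 1
        else:
            out.append(symbols[i])
            i += 1
    return out, hits


def grammar_size(seq):
    """Estimate grammar complexity as the total length of all rule bodies.

    Repeatedly introduces a rule N_k -> (x, y) for an adjacent pair that can
    be replaced at least twice. The result is the length of the remaining
    start rule plus 2 symbols for every introduced rule.
    """
    symbols = list(seq)
    n_rules = 0
    while True:
        pairs = Counter(zip(symbols, symbols[1:]))
        for pair, count in pairs.most_common():
            if count < 2:
                continue
            candidate, hits = replace_pair(symbols, pair, ("N", n_rules))
            if hits >= 2:  # overlapping pairs such as "aaa" may yield fewer
                symbols = candidate
                n_rules += 1
                break
        else:
            # No pair can be replaced twice: the grammar is final.
            return len(symbols) + 2 * n_rules


if __name__ == "__main__":
    periodic = "ab" * 16
    aperiodic = "ACDEFGHIKLMNPQRSTVWY"  # every symbol occurs once
    print(grammar_size(periodic), grammar_size(aperiodic))
    print(grammar_size(recode(aperiodic)))
```

Under this sketch, different categorizations could be compared by recoding the same set of sequences with each one and comparing the resulting sizes. Which direction counts as better, and how values are normalized for alphabet size, depends on the criterion adopted and is left open here.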
