The Language Metaphor in Sequence Analysis

Abstract The metaphors of language and coding have provided a powerful framework for organizing molecular biology. Many techniques developed in the analysis of text and other communication channels have been successfully applied to macromolecular sequences with little or no change. However, a number of properties, such as long-range interactions, structural dynamics and the importance of sequence variation in modulation of function, are poorly modelled by the language metaphor. It is not necessary to abandon the power of the language metaphor, but it is suggested that a conscious effort to set aside this compelling metaphor and examine other viewpoints is likely to be useful.
