Search of periodicities in primary structure of biopolymers: a general Fourier approach

We discuss a recently introduced, convenient way to study periodic patterns in the primary structures of biopolymers. For a biopolymer sequence, a symbolic correlation function is constructed and treated as a numerical sequence, to which a Fourier transform can then be applied. A further useful technical improvement is to close the sequence into a ring and scan over the ring length, which allows the study of periods of the order of the sequence length. This approach makes it possible to take into account any scores describing similarity between symbols and to compare results obtained using different Fourier-like and correlation matrix techniques. An algorithm for computing the power of the Fourier spectrum allows the detection of vague periods in sequences containing strong repeats. A PASCAL program, SYMFOUR, has been written and tested both on sequences with previously reported periodic patterns and on other sequences and sites of biological interest.
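The general idea can be illustrated with a minimal Python sketch. It is not the SYMFOUR program, and the definitions it uses are assumptions rather than the paper's formulas: the correlation is taken as the mean similarity score between symbols a distance k apart on the ring, the default score is simple identity (1 for a match, 0 otherwise), and the function names `symbolic_correlation`, `power_spectrum` and `dominant_period` are hypothetical.

```python
import numpy as np


def symbolic_correlation(seq, score=None):
    """Circular correlation C(k) = (1/N) * sum_i score(x_i, x_{(i+k) mod N}).

    The sequence is closed into a ring, so every shift k uses all N pairs.
    `score` is any symbol-similarity function; identity is assumed by default.
    """
    n = len(seq)
    if score is None:
        score = lambda a, b: 1.0 if a == b else 0.0
    corr = np.empty(n)
    for k in range(n):
        corr[k] = sum(score(seq[i], seq[(i + k) % n]) for i in range(n)) / n
    return corr


def power_spectrum(corr):
    """Power of the discrete Fourier transform of the centered correlation."""
    centered = corr - corr.mean()  # remove the zero-frequency component
    return np.abs(np.fft.rfft(centered)) ** 2


def dominant_period(seq, score=None):
    """Period N / k of the strongest non-zero frequency k in the spectrum."""
    power = power_spectrum(symbolic_correlation(seq, score))
    k = int(np.argmax(power[1:])) + 1  # skip index 0 (constant term)
    return len(seq) / k


if __name__ == "__main__":
    # A toy sequence built from an exact three-letter repeat.
    print(dominant_period("ACG" * 20))
```

Passing a different `score` function, for example one that looks up a substitution matrix, changes the similarity measure without touching the Fourier step. Scanning the ring length could be approximated by calling `dominant_period` on prefixes `seq[:L]` for a range of `L`, though how the resulting spectra should be compared is left open here.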
