Time-frequency based biological sequence querying

We investigate the use of time-frequency (TF) methods to query biological sequences for regions of similarity or critical relationships among the sequences. Existing querying approaches are insensitive to repeats, especially in low-complexity regions, and offer little support for efficiently querying sub-sequences that contain insertions and deletions (gaps). Our approach uses highly localized basis functions and multiple transformations in the TF plane to map the characters in a sequence as well as properties of a sub-sequence, such as its position in the sequence or the number of gaps between sub-sequences. We analyze gapped query-based alignment methods that use transformations in the TF plane, and we demonstrate that the method can potentially operate in real time without pre-processing. The algorithm's performance is compared with that of the widely accepted BLAST alignment approach, and a significant improvement is observed for queries with repetitive segments.
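The idea of mapping characters and positions to localized atoms in the TF plane can be illustrated with a minimal sketch. The abstract does not specify the basis functions or the mapping, so everything below is an assumption made for illustration: Gaussian atoms, one hypothetical frequency per nucleotide (`FREQS`), a fixed time shift per position (`STEP`), and a plain inner-product score. It handles ungapped queries only and is not the authors' algorithm.

```python
import numpy as np

# Hypothetical mapping (not from the paper): one normalized frequency per nucleotide.
FREQS = {"A": 0.05, "C": 0.15, "G": 0.25, "T": 0.35}
STEP = 64     # samples per character: position n maps to a time shift of n * STEP
SIGMA = 8.0   # width of the Gaussian window (assumed)

def atom(freq, length=STEP, sigma=SIGMA):
    """Unit-norm Gaussian atom modulated to the given frequency."""
    t = np.arange(length) - length / 2
    g = np.exp(-0.5 * (t / sigma) ** 2) * np.exp(2j * np.pi * freq * t)
    return g / np.linalg.norm(g)

def to_signal(seq):
    """Place one atom per character; the time slot encodes the position."""
    return np.concatenate([atom(FREQS[c]) for c in seq])

def match_scores(seq, query):
    """Inner product of the query signal with each aligned window of the sequence.

    Atoms at the same frequency contribute 1; atoms at different
    frequencies are nearly orthogonal, so each score is close to the
    number of matching characters at that offset.
    """
    s, q = to_signal(seq), to_signal(query)
    n_offsets = len(seq) - len(query) + 1
    return np.array([
        np.vdot(q, s[k * STEP : k * STEP + len(q)]).real
        for k in range(n_offsets)
    ])

scores = match_scores("ACGTTTGACGA", "TTGA")
best = int(np.argmax(scores))  # offset with the highest score
```

In this toy setting the score at each offset approximates the count of matching characters, so the best offset is where the query aligns exactly. Representing gaps or sub-sequence properties would need additional transformations in the TF plane, which this sketch does not attempt.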
