DNA Sequence Representation and Comparison Based on Quaternion Number System

Conventional schemes for DNA sequence representation, storage, and processing are usually based on character formats. We propose using the quaternion number system for the numerical representation and further processing of DNA sequences. In the proposed method, the quaternion cross-correlation operation yields both global and local matching/mismatching information between two DNA sequences: the global information is read from a plotted one-dimensional curve, and the local information from a two-dimensional pattern. Simulation results on various DNA sequences, together with a comparison against the well-known BLAST method, verify the effectiveness of the proposed method.
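As a rough illustration of the idea, the sketch below encodes each nucleotide as a quaternion and cross-correlates two sequences over all lags. The mapping A, C, G, T to 1, i, j, k is an assumption made for this example and may differ from the encoding used in the paper; the function names are likewise hypothetical. With this particular mapping, the product of a base with the conjugate of the same base is 1, while the product for two different bases is purely imaginary, so the real part of the correlation at each lag counts the matching positions.

```python
# Minimal sketch of quaternion cross-correlation for DNA sequences.
# Assumption: A, C, G, T map to the unit quaternions 1, i, j, k.
# This mapping is illustrative and not necessarily the paper's encoding.

BASE_TO_QUATERNION = {
    "A": (1, 0, 0, 0),
    "C": (0, 1, 0, 0),
    "G": (0, 0, 1, 0),
    "T": (0, 0, 0, 1),
}


def qmul(a, b):
    """Hamilton product of two quaternions given as (w, x, y, z) tuples."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (
        aw * bw - ax * bx - ay * by - az * bz,
        aw * bx + ax * bw + ay * bz - az * by,
        aw * by - ax * bz + ay * bw + az * bx,
        aw * bz + ax * by - ay * bx + az * bw,
    )


def qconj(q):
    """Quaternion conjugate: negate the three imaginary parts."""
    w, x, y, z = q
    return (w, -x, -y, -z)


def encode(seq):
    """Turn a DNA string into a list of quaternions."""
    return [BASE_TO_QUATERNION[base] for base in seq.upper()]


def cross_correlation(seq_a, seq_b):
    """Direct (non-FFT) cross-correlation: r[lag] = sum_n a[n + lag] * conj(b[n])."""
    a, b = encode(seq_a), encode(seq_b)
    result = {}
    for lag in range(-(len(b) - 1), len(a)):
        total = (0, 0, 0, 0)
        for n, qb in enumerate(b):
            idx = n + lag
            if 0 <= idx < len(a):  # only overlapping positions contribute
                prod = qmul(a[idx], qconj(qb))
                total = tuple(t + p for t, p in zip(total, prod))
        result[lag] = total
    return result


def match_counts(seq_a, seq_b):
    """Real part of the correlation at each lag = number of matching bases."""
    return {lag: q[0] for lag, q in cross_correlation(seq_a, seq_b).items()}


if __name__ == "__main__":
    counts = match_counts("ACGTACGT", "GTAC")
    for lag in sorted(counts):
        print(lag, counts[lag])
```

Plotting the real part against the lag gives a one-dimensional curve whose peaks mark the shifts at which the two sequences agree best. The direct double loop is quadratic in sequence length; it is used here only for clarity.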
