Spectral analysis of numerical exon and intron sequences

Spectral analysis of a DNA sequence first requires converting the base sequence to a numerical sequence. The choice of numerical mapping affects how well the biological properties of the sequence are reflected in the numerical domain, and therefore how reliably regions of interest can be detected. This paper presents twelve one-sequence numerical representation methods and a discrete Fourier transform (DFT) based approach that extracts the period-3 value of DNA sequences in order to classify exon and intron sequences. Simulations are carried out using short sequences of the human genome. The results indicate that the threshold value determined, and consequently the classification performance, depend on the adopted numerical representation method and on its window length.
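As a minimal sketch of the pipeline described above, the following Python code maps a base sequence to numbers, evaluates the DFT power at frequency index k = N/3 (the period-3 component), and compares it with a threshold. The mapping `EXAMPLE_MAP`, the function names, and the threshold argument are illustrative assumptions; they are not taken from the twelve representations or the threshold values studied in the paper.

```python
import cmath

# Hypothetical one-sequence mapping, chosen only for illustration
# (an assumption, not one of the paper's twelve representations).
EXAMPLE_MAP = {"A": 1.0, "C": 2.0, "G": 3.0, "T": 4.0}


def to_numeric(seq, mapping=EXAMPLE_MAP):
    """Convert a base sequence to a single numerical sequence."""
    return [mapping[base] for base in seq.upper()]


def period3_value(x):
    """Return the DFT power |X[N/3]|^2 of a window x.

    The window length N must be a positive multiple of 3 so that
    k = N/3 is an integer frequency index.
    """
    n = len(x)
    if n == 0 or n % 3 != 0:
        raise ValueError("window length must be a positive multiple of 3")
    k = n // 3
    coeff = sum(v * cmath.exp(-2j * cmath.pi * k * i / n)
                for i, v in enumerate(x))
    return abs(coeff) ** 2


def classify(seq, threshold):
    """Label a window 'exon' if its period-3 power exceeds the threshold.

    The threshold is a free parameter here; in practice it would have to
    be determined per representation and window length.
    """
    return "exon" if period3_value(to_numeric(seq)) > threshold else "intron"
```

Under these assumptions, a strictly period-3 toy window such as `"ACG" * 10` yields a large period-3 value, while a constant window such as `"A" * 30` yields a value near zero. Changing the mapping or the window length changes the scale of the value, which is why a threshold chosen for one representation does not carry over to another.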