Preliminary wavelet analysis of genomic sequences

Large genome-sequencing projects have made the development of accurate methods for annotating DNA sequences urgent. Existing methods combine ab initio pattern searches with knowledge gathered from comparison with sequence databases or from training sets of known genes. However, the accuracy of these methods is still far from satisfactory. In the present study, wavelet algorithms combined with an entropy method are being developed as an alternative way to determine gene locations in genomic DNA sequences. Wavelet methods detect periodicity present in sequences. A promising advantage of wavelets is their adaptivity to the varying lengths of coding and noncoding regions. Moreover, wavelet methods integrated with an entropy method examine only the information content of the sequences themselves and therefore do not need to be trained. Preliminary results on a sample of genomic DNA sequences show that the wavelet approach is feasible and may perform better than some knowledge-dependent approaches.
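As an illustration of how a wavelet can detect periodicity in a DNA sequence, the following minimal Python sketch measures local period-3 power, a signal commonly associated with coding regions. It is not the method of this study. The abstract does not specify the wavelet family, the scales, or the entropy measure, so everything here is an assumption: the binary indicator encoding, the single-scale Morlet-like kernel, the `width` parameter, the function names, and the synthetic test sequence. The entropy step is not sketched.

```python
import numpy as np

def indicator_sequences(dna):
    """Assumed encoding: one 0/1 track per nucleotide (binary indicators)."""
    seq = np.array(list(dna.upper()))
    return {base: (seq == base).astype(float) for base in "ACGT"}

def period3_power(dna, width=30.0):
    """Local power at period 3, summed over the four indicator tracks.

    The kernel is a Gaussian-windowed complex exponential, i.e. a
    Morlet-like wavelet evaluated at one scale only (hypothetical choice).
    """
    # Truncate the kernel at 3 standard deviations, never longer than the input.
    half = min(int(3 * width), (len(dna) - 1) // 2)
    t = np.arange(-half, half + 1)
    kernel = np.exp(-0.5 * (t / width) ** 2) * np.exp(2j * np.pi * t / 3.0)
    kernel /= np.sum(np.abs(kernel))

    power = np.zeros(len(dna))
    for track in indicator_sequences(dna).values():
        centered = track - track.mean()          # remove the constant offset
        coeff = np.convolve(centered, kernel, mode="same")
        power += np.abs(coeff) ** 2
    return power

# Synthetic example (fabricated for illustration, not real genomic data):
# random flanks around a block with an exaggerated period-3 pattern.
rng = np.random.default_rng(0)
flank_left = "".join(rng.choice(list("ACGT"), size=600))
flank_right = "".join(rng.choice(list("ACGT"), size=600))
periodic_block = "ATG" * 200
sequence = flank_left + periodic_block + flank_right

power = period3_power(sequence)
flank_mean = power[100:500].mean()     # inside the left random flank
coding_mean = power[700:1100].mean()   # inside the periodic block
print(f"flank: {flank_mean:.4f}  periodic block: {coding_mean:.4f}")
```

In this toy setting the periodic block shows markedly higher power than the random flanks. A multi-scale transform would be needed to exploit the adaptivity to region length that the abstract highlights; the single scale is used here only to keep the sketch short.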