Analysis of threshold influence on the accuracy of gene-prediction methods based on power spectrum analysis

The accuracy of gene-prediction methods based on power spectrum analysis depends on the threshold used to discriminate coding from non-coding sequences. Because gene structure differs between organisms, we hypothesized that each organism has its own optimal gene-prediction threshold. To test this, we analyzed real biological data and found that the optimal threshold does differ between organisms when power-spectrum-based methods are used to predict genes.
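The role of the threshold can be illustrated with a minimal sketch. It is not the implementation evaluated in the paper. It assumes the common formulation in which the sequence is mapped to four binary indicator sequences, the power at period 3 is computed in a sliding window, and a window is called coding when that power exceeds the threshold. The function names, the window length, the step size, and the use of plain accuracy as the selection criterion are all illustrative assumptions.

```python
import numpy as np

def period3_power(seq, window=351, step=3):
    """Sliding-window spectral power at period 3 (DFT bin k = window/3),
    summed over the four binary indicator sequences for A, C, G and T.
    The window length must be a multiple of 3 (assumed default: 351)."""
    if window % 3 != 0:
        raise ValueError("window must be a multiple of 3")
    seq = seq.upper()
    # One 0/1 indicator row per nucleotide
    indicators = np.array([[c == b for c in seq] for b in "ACGT"], dtype=float)
    # Complex exponential for the DFT coefficient at frequency 1/3
    basis = np.exp(-2j * np.pi * np.arange(window) / 3)
    starts = np.arange(0, len(seq) - window + 1, step)
    power = np.empty(len(starts))
    for i, s in enumerate(starts):
        coeffs = indicators[:, s:s + window] @ basis  # one coefficient per base
        power[i] = np.sum(np.abs(coeffs) ** 2)
    return starts, power

def classify(power, threshold):
    """Call a window coding when its period-3 power exceeds the threshold."""
    return power > threshold

def best_threshold(power, labels, candidates):
    """Pick the candidate threshold with the highest accuracy against known
    coding/non-coding labels (hypothetical selection criterion)."""
    labels = np.asarray(labels, dtype=bool)
    accuracy = [np.mean(classify(power, t) == labels) for t in candidates]
    return candidates[int(np.argmax(accuracy))]
```

Running `best_threshold` separately on annotated sequences from each organism would give one threshold per organism, which is the kind of comparison the abstract describes.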
