ExSep: An exon separation process using Neural Skyline Filter

Exons and introns are complementary parts of DNA and RNA. Because biological data sets are very large, extracting meaningful information from them is often expensive. To make exon separation faster and more efficient, an automated system, ExSep, was designed using a Neural Skyline Filter (NeuralSF) and a Bloom filter. The system allows a comparative analysis of performance among NeuralSF, the Bloom filter, and processing without a filter. The experiments and simulations show that NeuralSF outperforms the other two approaches in both the number of exons found and the processing time. The system may help to reduce redundant data in large collections and, beyond that, make big biological data easier to handle.
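The abstract does not say how the Bloom filter is applied to sequence data, so the following is only a minimal sketch of the general technique, not the ExSep implementation. It assumes that known exon fragments are stored as fixed-length substrings (k-mers) and that a query sequence is scanned for windows the filter may contain. The names `BloomFilter`, `kmers`, and `candidate_windows`, and the default sizes, are hypothetical choices made for illustration.

```python
import hashlib


class BloomFilter:
    """Fixed-size Bloom filter: no false negatives, possible false positives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive several bit positions from one SHA-256 digest (double hashing).
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force an odd step
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item)
        )


def kmers(sequence, k):
    """Return every length-k substring of the sequence, in order."""
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


def candidate_windows(sequence, known, k):
    """Return (offset, k-mer) pairs that the filter reports as possibly known."""
    return [(i, km) for i, km in enumerate(kmers(sequence, k)) if km in known]


# Assumption: a toy "known exon" fragment; real data would come from annotations.
known_exon = "ATGGCCATTG"
exon_filter = BloomFilter()
for fragment in kmers(known_exon, 4):
    exon_filter.add(fragment)

hits = candidate_windows("CCATGGCCATTGTT", exon_filter, 4)
```

Because a Bloom filter can report false positives, any window it flags would still need to be confirmed against the actual exon set; it never misses a fragment that was added.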
