Disease Diagnosis Using Pattern Matching Algorithm from DNA Sequencing: a Sequential and GPGPU based Approach

DNA sequencing is one of the most important areas of research today, with applications in forensic science, agriculture, and medicine, among others. Disease diagnosis from DNA sequencing is one such application and offers a harmless way to estimate the chances of disease occurrence. Chemical as well as sequential changes in DNA lead to disease. Because DNA is a very large data set, an efficient algorithm is needed to carry out the diagnosis. This study proposes sequential and GPGPU-based implementations of the Aho-Corasick multi-string pattern matching algorithm to estimate the chances of occurrence of certain nucleotide repeat diseases and some cancer types from different DNA sequences. The results demonstrate that the algorithm performs better with the GPGPU-based parallel approach, and that the speedup improves as the number of patterns increases. The proposed approach is therefore well suited to bioinformatics applications.
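To make the sequential variant concrete, the following is a minimal Python sketch of textbook Aho-Corasick multi-pattern matching over a nucleotide string. It is not the authors' implementation: the function names `build_automaton` and `find_all` and the example patterns are assumptions chosen for illustration, and the GPGPU version is not shown.

```python
from collections import deque


def build_automaton(patterns):
    """Build the trie, failure links, and output sets for the patterns."""
    goto = [{}]   # goto[state][char] -> next state
    fail = [0]    # failure link of each state
    out = [[]]    # indices of the patterns that end at each state
    for idx, pat in enumerate(patterns):
        state = 0
        for ch in pat:
            nxt = goto[state].get(ch)
            if nxt is None:
                nxt = len(goto)
                goto[state][ch] = nxt
                goto.append({})
                fail.append(0)
                out.append([])
            state = nxt
        out[state].append(idx)

    # Breadth-first pass: depth-1 states keep failure link 0 (the root).
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            # Inherit matches that end at the failure target (proper suffixes).
            out[nxt] = out[nxt] + out[fail[nxt]]
            queue.append(nxt)
    return goto, fail, out


def find_all(text, patterns):
    """Return (start_index, pattern) for every occurrence, in one pass over text."""
    goto, fail, out = build_automaton(patterns)
    state = 0
    hits = []
    for pos, ch in enumerate(text):
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for idx in out[state]:
            hits.append((pos - len(patterns[idx]) + 1, patterns[idx]))
    return hits


if __name__ == "__main__":
    # Illustrative sequence and trinucleotide pattern (hypothetical data).
    sequence = "ACAGCAGCAGT"
    for start, pattern in find_all(sequence, ["CAG"]):
        print(start, pattern)
```

All patterns are matched in a single scan of the sequence, so adding patterns enlarges the automaton without adding further passes over the text. This is one plausible reason the approach suits workloads where the number of patterns grows.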
