An efficient distributed bioinformatics computing system for DNA sequence analysis on encoding system

This paper presents an effective search technique for a distributed bioinformatics computing system that analyzes DNA sequences using the OPTSDNA algorithm. The system could be used for disease detection, criminal forensic analysis, gene prediction, genetic systems, and protein analysis. Several distributed algorithms were developed for searching and identifying DNA segments and repeat patterns in a given DNA sequence. The search algorithm computes the number of DNA sequences that contain the same consecutive types of DNA segments. A distributed subsequence identification algorithm was designed and implemented to detect the segments containing DNA sequences. Sequential and distributed implementations of these algorithms were executed with search segment patterns and genetic sequences of different lengths. The OPTSDNA algorithm is used to store DNA sequences of various sizes in a database, and it was tested on input DNA sequences ranging from very small to very large. The performance of the distributed search technique is compared with that of the sequential approach.
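
As a rough illustration of the sequential versus distributed comparison described above, the following minimal Python sketch counts occurrences of a search segment in a DNA sequence both ways. It is not the OPTSDNA algorithm or the authors' implementation: the function names (`count_sequential`, `split_with_overlap`, `count_distributed`) are hypothetical, and a local thread pool stands in for the distributed nodes. The only technique shown is splitting the sequence into chunks that overlap by one less than the segment length, so that a match spanning a chunk boundary is counted exactly once.

```python
from concurrent.futures import ThreadPoolExecutor


def count_sequential(sequence, pattern):
    """Count occurrences of pattern in sequence, including overlapping ones."""
    if not pattern:
        return 0
    count = 0
    start = sequence.find(pattern)
    while start != -1:
        count += 1
        start = sequence.find(pattern, start + 1)
    return count


def split_with_overlap(sequence, parts, overlap):
    """Split sequence into about `parts` chunks, each extended by `overlap` characters.

    The extension lets a match that starts near the end of one chunk be seen
    in full by that chunk, while the next chunk cannot contain it completely.
    """
    size = max(1, -(-len(sequence) // parts))  # ceiling division
    return [sequence[i:i + size + overlap] for i in range(0, len(sequence), size)]


def count_distributed(sequence, pattern, workers=4):
    """Count matches chunk by chunk; the thread pool is a stand-in for remote nodes."""
    if not pattern:
        return 0
    chunks = split_with_overlap(sequence, workers, len(pattern) - 1)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(lambda chunk: count_sequential(chunk, pattern), chunks))


if __name__ == "__main__":
    dna = "ACGTACGTACGT" * 3
    print(count_sequential(dna, "ACGT"), count_distributed(dna, "ACGT"))
```

Both functions should return the same count for any input; in a real deployment the chunks would be sent to separate machines, and the comparison of interest would be running time rather than the result.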
