STUDY OF BLAST DNA MATCHING TOOLKITS

Early work in bioinformatics produced algorithms for storing nucleic acid and protein sequences in annotated databases, so that researchers could exchange gene and protein sequence information quickly and easily. These databases are growing extremely fast, so it is essential to work with the current versions, which are readily available on the Web. This tutorial covers DNA matching with BLAST programs, specifically BLASTN and MEGABLAST, which are used to perform sequence similarity searches and are evaluated for their relative effectiveness. The BLAST results are interpreted, and the two algorithms are compared under varying parameters, including word size, query sequence length, and gap X drop-off value. As the word size increases, the computation time of both BLASTN and MEGABLAST decreases. BLASTN is more sensitive than MEGABLAST because it uses a shorter default word size of 11, compared with MEGABLAST's default of 28. The search strategy therefore offers a tradeoff between speed and sensitivity. For BLAST 2 Sequences, MEGABLAST performed better than BLASTN only for word sizes of 16 or greater and for longer sequences.
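The word-size tradeoff described above can be illustrated with a minimal sketch of exact-word seeding, the first stage of a BLAST-style search. This is not the BLASTN or MEGABLAST implementation: it omits hit extension, scoring, and significance statistics. The names `find_seeds` and `mutate` and the toy sequences are hypothetical and chosen only for illustration; the word sizes 11, 16, and 28 are the values discussed in the text.

```python
from collections import defaultdict


def find_seeds(query, subject, word_size):
    """Return (query_pos, subject_pos) pairs where an exact word matches.

    Hypothetical helper: indexes every word of length word_size in the
    subject, then looks up each query word in that index.
    """
    index = defaultdict(list)
    for j in range(len(subject) - word_size + 1):
        index[subject[j:j + word_size]].append(j)

    seeds = []
    for i in range(len(query) - word_size + 1):
        for j in index.get(query[i:i + word_size], []):
            seeds.append((i, j))
    return seeds


def mutate(seq, positions):
    """Substitute a different base at each given position (toy divergence)."""
    swap = {"A": "C", "C": "G", "G": "T", "T": "A"}
    bases = list(seq)
    for p in positions:
        bases[p] = swap[bases[p]]
    return "".join(bases)


# Made-up 40-base query; the subject is a copy with two substitutions,
# so the longest exactly matching run between them is 13 bases.
query = "ACGTTGCAAGCTTAGGCTAACGGATCCGTTAGCATCGGAT"
subject = mutate(query, [13, 27])

for w in (11, 16, 28):
    seeds = find_seeds(query, subject, w)
    print(f"word size {w}: {len(seeds)} seed(s)")
```

In this toy case the shortest word size still finds seeds across the substitutions, while the larger word sizes find none. Fewer seeds mean less work in later stages, but a diverged match can be missed entirely, which is the speed versus sensitivity tradeoff noted above.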
