A comparative analysis of Smith-Waterman based partial alignment

Finding large deletions in genome sequences has become increasingly useful in bioinformatics, for example in clinical research and diagnosis. Several partial alignment approaches based on the Smith-Waterman (SW) algorithm have been proposed for alignment with large gaps. However, the literature offers no detailed comparison of these three SW-based methods in terms of runtime and of the error in the estimated start position of the deletion in the query sequence. Our comparative simulations show that BinaryPartialAlign has the lowest error of the three while also running very fast.
