A Practical Parameterized Algorithm for Weighted Minimum Letter Flips Model of the Individual Haplotyping Problem

Given a set of DNA sequence fragments from an individual, in which every base of every fragment carries a confidence value, the weighted minimum letter flips (WMLF) model of the individual haplotyping problem asks for a pair of haplotypes obtained by flipping bases such that the sum of the confidence values of the flipped bases is minimized. WMLF is NP-hard. This paper proposes a parameterized exact algorithm for WMLF that runs in time $O(nk_2 2^{k_2} + m\log m + mk_1)$, where $m$ is the number of fragments, $n$ is the number of SNP sites, $k_1$ is the maximum number of SNP sites that a fragment covers, and $k_2$ is the maximum number of fragments that cover a SNP site. Since both $k_1$ and $k_2$ are small in real biological experiments, the algorithm is efficient in practice.
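
To make the WMLF objective concrete, the following is a minimal sketch that evaluates it and solves a toy instance by exhaustive search. It is not the parameterized algorithm proposed in the paper; it enumerates all $4^n$ haplotype pairs and is only usable for very small $n$. The encoding (fragments as lists over 0, 1 and `None` for uncovered sites, with integer confidence weights), the assumption that the two haplotypes are unconstrained binary strings, and the function names `flip_cost`, `wmlf_cost` and `brute_force_wmlf` are illustrative choices, not taken from the paper.

```python
from itertools import product

def flip_cost(fragment, weights, haplotype):
    """Sum of confidence values of the covered bases that disagree with haplotype."""
    return sum(w for base, w, h in zip(fragment, weights, haplotype)
               if base is not None and base != h)

def wmlf_cost(fragments, weights, h1, h2):
    """WMLF objective for a fixed haplotype pair: each fragment is
    assigned to whichever haplotype it can match more cheaply."""
    return sum(min(flip_cost(f, w, h1), flip_cost(f, w, h2))
               for f, w in zip(fragments, weights))

def brute_force_wmlf(fragments, weights, n):
    """Exhaustive search over all pairs of binary haplotypes of length n.
    Exponential in n; for illustration only."""
    best = None
    for h1 in product((0, 1), repeat=n):
        for h2 in product((0, 1), repeat=n):
            cost = wmlf_cost(fragments, weights, h1, h2)
            if best is None or cost < best[0]:
                best = (cost, h1, h2)
    return best

# Toy instance (made up for illustration): 4 fragments over 3 SNP sites.
fragments = [
    [0, 0, None],
    [1, 1, 1],
    [0, 1, 0],
    [None, 1, 1],
]
weights = [
    [9, 8, 0],
    [9, 9, 9],
    [9, 2, 7],
    [0, 8, 9],
]

cost, h1, h2 = brute_force_wmlf(fragments, weights, 3)
print(cost, h1, h2)
```

In this toy instance the third fragment conflicts with every other fragment, so at least one base must be flipped; the cheapest option is its low-confidence middle base (weight 2), which yields the haplotypes 000 and 111.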
