Complete Parsimony Haplotype Inference Problem and Algorithms

Haplotype inference by pure parsimony (HIPP) is a well-known paradigm for haplotype inference. To assess the biological significance of this paradigm, we generalize HIPP to the problem of finding all optimal solutions, which we call complete HIPP. We study intrinsic haplotype features, such as backbone haplotypes and fat genotypes, as well as equal columns and decomposability. We explicitly exploit these features in three computational approaches: one based on integer linear programming, one based on depth-first branch-and-bound, and a hybrid algorithm that draws on the diverse strengths of the first two. Our experimental analysis shows that the optimized algorithms are significantly superior to the baseline algorithms, often running orders of magnitude faster. Finally, our experiments provide some useful insights into the intrinsic features of this interesting problem.
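
To make the problem statement concrete, the following is a minimal brute-force sketch of complete HIPP on a toy instance. It is not one of the three algorithms described above. It assumes the common encoding in which a genotype is a string over {0, 1, 2} with 2 marking a heterozygous site, and it identifies a solution with its set of haplotypes. The function names are hypothetical, and the backbone is taken here to mean the haplotypes shared by every optimal solution, which is an assumption about the definition.

```python
from itertools import combinations, product


def explaining_pairs(genotype):
    """Return all unordered haplotype pairs that resolve one genotype.

    Assumed encoding: '0' and '1' are homozygous sites, '2' is heterozygous.
    """
    het_sites = [i for i, site in enumerate(genotype) if site == "2"]
    pairs = set()
    for bits in product("01", repeat=len(het_sites)):
        h1, h2 = list(genotype), list(genotype)
        for i, bit in zip(het_sites, bits):
            h1[i] = bit
            h2[i] = "1" if bit == "0" else "0"  # the two haplotypes differ here
        pairs.add(frozenset(("".join(h1), "".join(h2))))
    return pairs


def all_optimal_solutions(genotypes):
    """Enumerate every minimum-size haplotype set explaining all genotypes.

    Exhaustive search over subsets of candidate haplotypes, smallest first;
    exponential, so suitable only for tiny illustrative inputs.
    """
    pairs = [explaining_pairs(g) for g in genotypes]
    candidates = sorted(set().union(*(p for ps in pairs for p in ps)))
    for k in range(1, len(candidates) + 1):
        optimal = []
        for subset in combinations(candidates, k):
            chosen = set(subset)
            # Feasible if each genotype has a resolving pair inside the subset.
            if all(any(p <= chosen for p in ps) for ps in pairs):
                optimal.append(chosen)
        if optimal:
            return optimal  # first feasible size is the parsimony optimum
    return []


def backbone(solutions):
    """Haplotypes present in every optimal solution (assumed definition)."""
    return set.intersection(*solutions) if solutions else set()


if __name__ == "__main__":
    solutions = all_optimal_solutions(["22", "20"])
    print(solutions)
    print(backbone(solutions))
```

On the toy input `["22", "20"]`, the genotype `20` forces haplotypes `00` and `10`, while `22` can be resolved either as `00`/`11` or as `01`/`10`, so there are two optimal solutions of size three that share `00` and `10`.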
