Beaches of Islands of Tractability: Algorithms for Parsimony and Minimum Perfect Phylogeny Haplotyping Problems

The Parsimony Haplotyping (PH) problem asks for the smallest set of haplotypes that can explain a given set of genotypes, and the Minimum Perfect Phylogeny Haplotyping (MPPH) problem asks for the smallest such set that also allows the haplotypes to be embedded in a perfect phylogeny evolutionary tree, a well-known, biologically motivated data structure. For PH, we extend recent work of [17] by further mapping the interface between “easy” and “hard” instances within the framework of (k,l)-bounded instances. By exploring the tractability frontier of MPPH in the same way, we provide the first concrete, positive results for this problem, and the algorithms underpinning these results offer new insights into how MPPH might be tackled further in the future. In both PH and MPPH, intriguing open problems remain.
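To make the PH objective concrete, the following is a minimal sketch of an exhaustive search on a toy instance. It assumes the encoding that is conventional in the haplotyping literature but not stated in the abstract: a genotype is a vector over {0, 1, 2}, a haplotype is a vector over {0, 1}, and a pair of haplotypes explains a genotype if both agree with it wherever it is 0 or 1 and differ from each other wherever it is 2. The function names are hypothetical, and this exponential-time search is for illustration only; it is not one of the algorithms developed in the paper.

```python
from itertools import combinations, product


def resolves(h1, h2, g):
    """Return True if the haplotype pair (h1, h2) explains genotype g.

    Assumed encoding: 0/1 in g means both haplotypes carry that value,
    2 means the two haplotypes differ at that site.
    """
    return all(
        (a == b == s) if s in (0, 1) else (a != b)
        for a, b, s in zip(h1, h2, g)
    )


def explains(haplotypes, genotypes):
    """Return True if every genotype is resolved by some pair from the set.

    A haplotype may be paired with itself (genotypes without any 2).
    """
    return all(
        any(resolves(h1, h2, g) for h1 in haplotypes for h2 in haplotypes)
        for g in genotypes
    )


def parsimony_haplotyping(genotypes):
    """Brute-force a smallest explaining set; only feasible for tiny inputs."""
    m = len(genotypes[0])
    candidates = list(product((0, 1), repeat=m))  # all 2^m haplotypes
    for size in range(1, len(candidates) + 1):
        for subset in combinations(candidates, size):
            if explains(subset, genotypes):
                return list(subset)
    return None


if __name__ == "__main__":
    toy = [(2, 0, 1), (0, 0, 1), (2, 2, 1)]
    print(parsimony_haplotyping(toy))
```

On this toy instance the first genotype forces the pair (0,0,1) and (1,0,1), the second reuses (0,0,1), and the third needs one additional haplotype, so an optimal solution has three haplotypes.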

[1] Dan Gusfield, et al. A Linear-Time Algorithm for the Perfect Phylogeny Haplotyping (PPH) Problem, 2005, RECOMB.

[2] Shibu Yooseph, et al. A Note on Efficient Computation of Haplotypes via Perfect Phylogeny, 2004, J. Comput. Biol.

[3] B. Peyton, et al. An Introduction to Chordal Graphs and Clique Trees, 1993.

[4] Dan Gusfield, et al. Efficient algorithms for inferring evolutionary trees, 1991, Networks.

[5] Luonan Chen, et al. Models and Algorithms for Haplotyping Problem, 2006.

[6] Roded Sharan, et al. Islands of Tractability for Parsimony Haplotyping, 2005, IEEE/ACM Transactions on Computational Biology and Bioinformatics.

[7] Leo van Iersel, et al. On the Complexity of Several Haplotyping Problems, 2005, WABI.

[8] Paola Bonizzoni, et al. The Haplotyping problem: An overview of computational models and solutions, 2003, Journal of Computer Science and Technology.

[9] Giuseppe Lancia, et al. A polynomial case of the parsimony haplotyping problem, 2006, Oper. Res. Lett.

[10] Maxime Crochemore, et al. Algorithms on strings, 2007.

[11] Dan Gusfield, et al. Algorithms on Strings, Trees, and Sequences - Computer Science and Computational Biology, 1997.

[12] Shibu Yooseph, et al. A Survey of Computational Methods for Determining Haplotypes, 2002, Computational Methods for SNPs and Haplotype Inference.

[13] Giuseppe Lancia, et al. Haplotyping Populations by Pure Parsimony: Complexity of Exact and Approximation Algorithms, 2004, INFORMS J. Comput.

[14] Daniel G. Brown, et al. Integer programming approaches to haplotype inference by pure parsimony, 2006, IEEE/ACM Transactions on Computational Biology and Bioinformatics.

[15] J. R. S. Blair, et al. Introduction to Chordal Graphs and Clique Trees, in Graph Theory and Sparse Matrix Computation, 1997.

[16] Viggo Kann, et al. Hardness of Approximating Problems on Cubic Graphs, 1997, CIAC.

[17] Yun S. Song, et al. Algorithms for Imperfect Phylogeny Haplotyping (IPPH) with a Single Homoplasy or Recombination Event, 2005, WABI.

[18] Dan Gusfield, et al. Haplotype Inference by Pure Parsimony, 2003, CPM.

[19] Robert E. Tarjan, et al. Algorithmic Aspects of Vertex Elimination on Graphs, 1976, SIAM J. Comput.

[20] Amar Mukherjee, et al. An Optimal Algorithm for Perfect Phylogeny Haplotyping, 2006, J. Comput. Biol.

[21] Mihalis Yannakakis, et al. Optimization, approximation, and complexity classes, 1991, STOC '88.