The Uniform Minimum-Ones 2SAT Problem and its Application to Haplotype Classification

Analyzing genomic data to find the gene variations responsible for hereditary diseases is one of the great challenges in modern bioinformatics. In many living beings (including humans), every gene is present in two copies, the so-called haplotypes, inherited from the two parents. In this paper, we propose a simple combinatorial model for classifying the set of haplotypes in a population according to their responsibility for a certain genetic disease. This model is based on the minimum-ones 2SAT problem with uniform clauses. The minimum-ones 2SAT problem asks for a satisfying assignment to a satisfiable formula in 2CNF that sets a minimum number of variables to true. This problem is well known to be NP-hard, even in the case where all clauses are uniform, i.e., no clause contains both a positive and a negative literal. We analyze the approximability and present the first non-trivial exact algorithm for the uniform minimum-ones 2SAT problem, with a running time of O(1.21061^n) on a 2SAT formula with n variables. We also show that the problem is fixed-parameter tractable by showing that our algorithm can be adapted to verify in O(2^k) time whether an assignment with at most k true variables exists.
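To make the problem statement concrete, the following Python sketch encodes a uniform 2CNF formula and solves it in two naive ways. It is an illustration only, not the algorithm of this paper: the clause encoding `(positive, u, v)`, the function names, and the example formula are assumptions made here. The first function simply enumerates candidate sets by increasing size; the second is a textbook bounded search tree that branches on an unsatisfied positive clause, which gives at most 2^k branches (times polynomial work per node) and so hints at why the problem is fixed-parameter tractable in k.

```python
from itertools import combinations

# Hypothetical encoding: a uniform clause is a triple (positive, u, v).
# positive=True  stands for (u OR v);
# positive=False stands for (NOT u OR NOT v).

def satisfies(true_vars, clauses):
    """Check whether setting exactly `true_vars` to true satisfies all clauses."""
    for positive, u, v in clauses:
        if positive and u not in true_vars and v not in true_vars:
            return False
        if not positive and u in true_vars and v in true_vars:
            return False
    return True

def min_ones_bruteforce(variables, clauses):
    """Try candidate sets by increasing size; return a smallest satisfying set or None."""
    for size in range(len(variables) + 1):
        for subset in combinations(variables, size):
            if satisfies(set(subset), clauses):
                return set(subset)
    return None

def has_assignment_with_k_true(clauses, k, true_vars=frozenset()):
    """Bounded search tree: can the formula be satisfied by adding at most k true variables?"""
    # Setting further variables to true can never repair a violated negative clause.
    for positive, u, v in clauses:
        if not positive and u in true_vars and v in true_vars:
            return False
    # Look for a positive clause that is not yet satisfied.
    for positive, u, v in clauses:
        if positive and u not in true_vars and v not in true_vars:
            if k == 0:
                return False
            # Every solution must set u or v to true, so branch on both options.
            return (has_assignment_with_k_true(clauses, k - 1, true_vars | {u})
                    or has_assignment_with_k_true(clauses, k - 1, true_vars | {v}))
    return True  # all positive clauses satisfied, no negative clause violated

# Example formula (assumed for illustration):
# (a OR b) AND (b OR c) AND (NOT a OR NOT c) AND (c OR d)
example_vars = ["a", "b", "c", "d"]
example_clauses = [
    (True, "a", "b"),
    (True, "b", "c"),
    (False, "a", "c"),
    (True, "c", "d"),
]
smallest = min_ones_bruteforce(example_vars, example_clauses)
```

In this example no single variable satisfies all three positive clauses, so any solution needs at least two true variables; both functions agree on that threshold. The brute-force routine takes exponential time in n and serves only as a reference point for the definition.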
