Merging Partially Labelled Trees: Hardness and a Declarative Programming Solution

Intraspecific studies often make use of haplotype networks instead of gene genealogies to represent the evolution of a set of genes. Cassens et al. proposed one such network reconstruction method, based on the global maximum parsimony principle, which was later recast by the first author of the present work as the problem of finding a minimum common supergraph of a set of t partially labelled trees. Although algorithms have been proposed for solving that problem on two graphs, the complexity of the general problem on trees remains unknown. In this paper, we show that the corresponding decision problem is NP-complete for t=3. We then propose a declarative programming approach to solving the problem to optimality in practice, as well as a heuristic approach, both based on the IDP system, and assess the performance of both methods on randomly generated data.

[1]  Frank van Harmelen, et al. Handbook of Knowledge Representation, 2008.

[2]  Daniel H. Huson, et al. Phylogenetic Networks - Concepts, Algorithms and Applications, 2011.

[3]  Cesare Tinelli, et al. Handbook of Satisfiability, 2021.

[4]  Johan Wittocx, et al. Grounding FO and FO(ID) with Bounds, 2010, J. Artif. Intell. Res.

[5]  Abraham Kandel, et al. On the Minimum Common Supergraph of Two Graphs, 2000, Computing.

[6]  Thomas J. Schaefer, et al. The complexity of satisfiability problems, 1978, STOC.

[7]  Daniel H. Huson, et al. Phylogenetic Networks: Contents, 2010.

[8]  David Morrison, et al. Who is Who in Phylogenetic Networks: Articles, Authors and Programs, 2016, ArXiv.

[9]  Aaas News, et al. Book Reviews, 1893, Buffalo Medical and Surgical Journal.

[10]  Luay Nakhleh, et al. Phylogenetic networks, 2004.

[11]  Johan Wittocx, et al. The IDP system: A model expansion system for an extension of classical logic, 2008.

[12]  Daniel H. Huson, et al. Phylogenetic Networks: Algorithms and applications, 2011.

[13]  Reinhard Diestel, et al. Graph Theory, 1997.

[14]  A. Paone, et al. Discrete Time Relaxation Based on Direct Quadrature Methods for Volterra Integral Equations, 1999, Computing.

[15]  Patrick Mardulyn, et al. Evaluating intraspecific "network" construction methods using simulated sequence data: do existing algorithms outperform the global maximum parsimony approach?, 2005, Systematic Biology.

[16]  Stephen A. Cook, et al. The complexity of theorem-proving procedures, 1971, STOC.

[18]  H. Bandelt, et al. Mitochondrial portraits of human populations using median networks, 1995, Genetics.

[19]  Anthony Labarre, et al. Combinatorial aspects of genome rearrangements and haplotype networks, 2008.

[20]  Bart Selman, et al. Satisfiability Solvers, 2008, Handbook of Knowledge Representation.

[21]  Maurice Bruynooghe, et al. SAT(ID): Satisfiability of Propositional Logic Extended with Inductive Definitions, 2008, SAT.