The complexity of rooted phylogeny problems

Several computational problems in phylogenetic reconstruction can be formulated as restrictions of the following general problem: given a formula in conjunctive normal form whose atomic formulas are rooted triples, is there a rooted binary tree that satisfies the formula? If the formulas contain neither disjunctions nor negations, the problem becomes the famous rooted triple consistency problem, which can be solved in polynomial time by an algorithm of Aho, Sagiv, Szymanski, and Ullman. Ng, Steel, and Wormald showed that the problem remains NP-complete when the clauses are restricted to disjunctions of negated triples. We systematically study the computational complexity of the problem for all such restrictions of the clauses in the input formula. For certain restricted disjunctions of triples we present an algorithm that has sub-quadratic running time and is asymptotically as fast as the fastest known algorithm for the rooted triple consistency problem. We also show that any restriction of the general rooted phylogeny problem that does not fall into our tractable class is NP-complete, using known results about the complexity of Boolean constraint satisfaction problems. Finally, we present a pebble game argument showing that the rooted triple consistency problem (and also all generalizations studied in this paper) cannot be solved by Datalog.
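To make the rooted triple consistency problem concrete, the following is a minimal sketch of the classical polynomial-time approach attributed above to Aho, Sagiv, Szymanski, and Ullman. It is the straightforward recursive version, not the sub-quadratic algorithm of this paper. The encoding is an assumption made for illustration: a triple `(a, b, c)` stands for ab|c (the leaves a and b are more closely related to each other than either is to c), leaves are non-tuple values such as strings, and the function names `build`, `leaf_set`, and `displays` are hypothetical.

```python
def build(leaves, triples):
    """Return a rooted binary tree (nested 2-tuples) displaying all triples, or None."""
    leaves = set(leaves)
    if len(leaves) == 1:
        return next(iter(leaves))

    # Only triples whose three leaves are all still present constrain this call.
    active = [t for t in triples if all(x in leaves for x in t)]

    # Union-find: join a and b for every active triple ab|c.
    parent = {x: x for x in leaves}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b, _ in active:
        parent[find(a)] = find(b)

    components = {}
    for x in leaves:
        components.setdefault(find(x), set()).add(x)

    # A single component means no valid split at the root exists: inconsistent.
    if len(components) == 1:
        return None

    subtrees = []
    for component in components.values():
        subtree = build(component, active)
        if subtree is None:
            return None
        subtrees.append(subtree)

    # Fold the component subtrees into a binary tree; any order is valid.
    tree = subtrees[0]
    for subtree in subtrees[1:]:
        tree = (tree, subtree)
    return tree


def leaf_set(tree):
    """Collect the leaves of a nested-tuple tree."""
    if not isinstance(tree, tuple):
        return {tree}
    return leaf_set(tree[0]) | leaf_set(tree[1])


def displays(tree, triple):
    """Check that the lowest common ancestor of a and b lies strictly below that of a and c."""
    a, b, c = triple
    node = tree
    while isinstance(node, tuple):
        for child in node:
            if {a, b} <= leaf_set(child):
                node = child  # descend while one child still contains both a and b
                break
        else:
            break  # node is the lowest common ancestor of a and b
    return c not in leaf_set(node)
```

For example, `build("abcd", [("a", "b", "c"), ("c", "d", "a")])` returns a tree that groups a with b and c with d, whereas `build("abc", [("a", "b", "c"), ("b", "c", "a")])` returns `None` because the two triples contradict each other. The shape of a returned tree may vary between runs when the triples leave it underdetermined, so checking the result with `displays` is more robust than comparing against a fixed tree.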
