Two Strikes Against Perfect Phylogeny

One of the major efforts in molecular biology is the computation of phylogenies for species sets. A longstanding open problem in this area is the Perfect Phylogeny problem. For almost two decades its complexity remained open, with progress limited to polynomial-time algorithms for a few special cases and NP-completeness proofs for many relaxations of the problem. From an applications point of view, the problem is of interest both in its general form, where the number of characters may vary, and in its fixed-parameter form. The Perfect Phylogeny problem has been shown to be equivalent to the problem of triangulating colored graphs [30]. It has also been shown recently that, for a given fixed number of characters, the yes-instances have bounded treewidth [45], opening the possibility of applying methodologies for bounded treewidth to the fixed-parameter form of the problem. We show that the Perfect Phylogeny problem is difficult in two different ways: the general problem is NP-complete, and the various finite-state approaches for bounded treewidth cannot be applied to the fixed-parameter forms of the problem.
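To make the polynomial-time special cases mentioned above concrete, the following is a minimal sketch of one well-known case, binary characters, which can be decided by the pairwise four-gamete test. It is an illustration only and is not taken from this paper; the function name `binary_perfect_phylogeny_exists` and the row-per-species matrix layout are assumptions made for the example.

```python
from itertools import combinations


def binary_perfect_phylogeny_exists(matrix):
    """Four-gamete test for binary characters (illustrative sketch).

    Assumptions (not from the paper): `matrix` is a list of rows, one per
    species, each row a list of 0/1 character states of equal length.
    A set of binary characters admits an (unrooted) perfect phylogeny
    exactly when no pair of characters shows all four state combinations
    00, 01, 10 and 11 across the species.
    """
    if not matrix:
        return True
    n_chars = len(matrix[0])
    for i, j in combinations(range(n_chars), 2):
        # Collect the distinct state pairs observed for characters i and j.
        gametes = {(row[i], row[j]) for row in matrix}
        if len(gametes) == 4:
            return False  # characters i and j are incompatible
    return True


# Hypothetical example data: four species, three binary characters.
compatible = [[0, 0, 0], [1, 0, 0], [1, 1, 0], [0, 0, 1]]
# Two characters exhibiting all four combinations.
conflicting = [[0, 0], [0, 1], [1, 0], [1, 1]]

print(binary_perfect_phylogeny_exists(compatible))
print(binary_perfect_phylogeny_exists(conflicting))
```

The test runs in time quadratic in the number of characters and linear in the number of species. It covers only two-state characters; the general problem, with arbitrarily many states per character, is the one the paper shows to be NP-complete.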

[1]  G. F. Estabrook, et al. An algebraic analysis of cladistic characters, 1976, Discret. Math.

[2]  Peter Buneman, et al. A characterisation of rigid circuit graphs, 1974, Discret. Math.

[3]  Kellogg S. Booth, et al. Testing for the Consecutive Ones Property, Interval Graphs, and Graph Planarity Using PQ-Tree Algorithms, 1976, J. Comput. Syst. Sci.

[4]  G. Dirac. On rigid circuit graphs, 1961.

[5]  Detlef Seese, et al. Easy Problems for Tree-Decomposable Graphs, 1991, J. Algorithms.

[6]  Sampath Kannan, et al. Triangulating three-colored graphs, 1991, SODA '91.

[7]  E. Wilson. A Consistency Test for Phylogenies Based on Contemporaneous Species, 1965.

[8]  Bruno Courcelle, et al. The Monadic Second-Order Logic of Graphs. I. Recognizable Sets of Finite Graphs, 1990, Inf. Comput.

[9]  Dan Gusfield, et al. Efficient algorithms for inferring evolutionary trees, 1991, Networks.

[10]  Stefan Arnborg, et al. Efficient algorithms for combinatorial problems on graphs with bounded decomposability — A survey, 1985, BIT.

[11]  Donald J. Rose, et al. On simple characterizations of k-trees, 1974, Discret. Math.

[12]  Christopher A. Meacham, et al. Theoretical and Computational Considerations of the Compatibility of Qualitative Taxonomic Characters, 1983.

[13]  C. Lekkerkerker, et al. Representation of a finite graph by a set of intervals on the real line, 1962.

[14]  Bruno Courcelle, et al. An algebraic theory of graph reduction, 1993, JACM.

[15]  R. Graham, et al. The Steiner problem in phylogeny is NP-complete, 1982.

[16]  Sampath Kannan, et al. Inferring evolutionary history from DNA sequences, 1990, Proceedings of the 31st Annual Symposium on Foundations of Computer Science.

[17]  Stephen T. Hedetniemi, et al. Linear algorithms on k-terminal graphs, 1987.

[18]  F. Gavril. The intersection graphs of subtrees in trees are exactly the chordal graphs, 1974.

[19]  M. Golumbic. Algorithmic graph theory and perfect graphs, 1980.

[20]  F. McMorris. On the compatibility of binary qualitative taxonomic characters, 1977, Bulletin of Mathematical Biology.

[21]  Rolf H. Möhring, et al. The Pathwidth and Treewidth of Cographs, 1993, SIAM J. Discret. Math.

[22]  J. Felsenstein. Numerical Methods for Inferring Evolutionary Trees, 1982, The Quarterly Review of Biology.

[23]  Le Quesne, et al. The Uniquely Evolved Character Concept, 1977.

[24]  Annegret Habel, et al. Hyperedge Replacement: Grammars and Languages, 1992, Lecture Notes in Computer Science.

[25]  Clemens Lautemann, et al. Efficient Algorithms on Context-Free Graph Grammars, 1988, ICALP.

[26]  Hans L. Bodlaender, et al. Dynamic Programming on Graphs with Bounded Treewidth, 1988, ICALP.

[27]  R. Sokal, et al. A Method for Deducing Branching Sequences in Phylogeny, 1965.

[28]  R. Sokal, et al. Principles of numerical taxonomy, 1965.

[29]  G. Estabrook, et al. An idealized concept of the true cladistic character, 1975.

[30]  D. R. Fulkerson, et al. Incidence matrices and interval graphs, 1965.

[31]  Andrzej Proskurowski, et al. Separating subgraphs in k-trees: Cables and caterpillars, 1984, Discret. Math.

[32]  Christopher A. Meacham, et al. Evaluating Characters by Character Compatibility Analysis, 1984.

[33]  Walter J. Lequesne. Further Studies Based on the Uniquely Derived Character Concept, 1972.

[34]  F. McMorris, et al. A Mathematical Foundation for the Analysis of Cladistic Character Compatibility, 1976.

[35]  Derek G. Corneil, et al. Complexity of finding embeddings in a k-tree, 1987.

[36]  Robert E. Tarjan, et al. Data structures and network algorithms, 1983, CBMS-NSF Regional Conference Series in Applied Mathematics.

[37]  Bruce A. Reed, et al. Finding approximate separators and computing tree width quickly, 1992, STOC '92.

[38]  G. Estabrook, et al. Cladistic Methodology: A Discussion of the Theoretical Basis for the Induction of Evolutionary History, 1972.

[39]  W. Fitch, et al. Construction of phylogenetic trees, 1967, Science.

[40]  D. Rose. Triangulated graphs and the elimination process, 1970.

[41]  W. J. Quesne. The Uniquely Evolved Character Concept and its Cladistic Application, 1974.

[42]  W. J. Quesne, et al. A Method of Selection of Characters in Numerical Taxonomy, 1969.

[43]  G. Estabrook, et al. Compatibility Methods in Systematics, 1985.

[44]  Stefan Arnborg, et al. Linear time algorithms for NP-hard problems restricted to partial k-trees, 1989, Discret. Appl. Math.

[45]  Ton Kloks, et al. A Simple Linear Time Algorithm for Triangulating Three-Colored Graphs, 1992, J. Algorithms.

[46]  Ton Kloks, et al. Better Algorithms for the Pathwidth and Treewidth of Graphs, 1991, ICALP.

[47]  Fred R. McMorris, et al. Triangulating vertex colored graphs, 1994, SODA '93.