COMPUTATIONAL COMPLEXITY OF INFERRING PHYLOGENIES BY COMPATIBILITY

A well-known approach to inferring phylogenies is to find a phylogeny with which the largest number of characters are perfectly compatible. Variants of this problem depend on whether the characters are cladistic (rooted) or qualitative (unrooted), and on whether they are binary (two states) or unconstrained (more than one state). The computational cost of known algorithms that guarantee solutions to these problems increases at least exponentially with problem size, so practical considerations restrict such algorithms to small problems. We establish that the four basic variants of the compatibility problem are all NP-complete and are thus so computationally difficult that efficient optimal algorithms for them are unlikely to exist.

(Character compatibility; computational complexity; evolutionary tree; NP-complete; phylogenetic inference.)
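To make the problem statement concrete, the following is a minimal sketch of the binary qualitative (unrooted) variant only. It is not the paper's method. It assumes each character is a list of 0/1 states, one per taxon, and it relies on two standard results for binary characters: two characters are compatible exactly when at most three of the four state combinations occur across the taxa, and a set of binary characters is jointly compatible when every pair in it is compatible. The function names `compatible` and `largest_compatible_subset` are hypothetical. The search tries subsets from largest to smallest, so its running time grows exponentially with the number of characters, which is the behavior the abstract describes for guaranteed-optimal algorithms.

```python
from itertools import combinations


def compatible(a, b):
    """Pairwise test for two binary unrooted characters.

    a and b list the 0/1 state of each taxon. The pair is compatible
    when fewer than four distinct state combinations occur.
    """
    return len(set(zip(a, b))) < 4


def largest_compatible_subset(characters):
    """Brute-force search for a largest pairwise-compatible subset.

    Returns the indices of the chosen characters. Subsets are tried from
    largest to smallest, so the worst case is exponential in len(characters).
    """
    n = len(characters)
    for size in range(n, 0, -1):
        for subset in combinations(range(n), size):
            if all(
                compatible(characters[i], characters[j])
                for i, j in combinations(subset, 2)
            ):
                return list(subset)
    return []


# Illustrative data (made up for this sketch): three characters on four taxa.
chars = [
    [0, 0, 1, 1],
    [0, 1, 0, 1],  # conflicts with the first character
    [0, 0, 0, 1],
]
best = largest_compatible_subset(chars)
```

The pairwise shortcut does not carry over to unconstrained (multistate) characters, where a set can be pairwise compatible without being jointly compatible, so this sketch should not be read as covering all four variants.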
