Inferring Phylogenetic Trees Using Answer Set Programming

We describe how to reconstruct a phylogeny for a set of taxa, using a character-based cladistics approach, in a declarative knowledge representation formalism, and we show how the computational methods of answer set programming can be used to generate conjectures about the evolution of the given taxa. We have applied this method in two domains: the historical analysis of languages and the historical analysis of parasite-host systems. In particular, we have computed plausible phylogenies for Chinese dialects, for Indo-European language groups, and for Alcataenia species. Some of these phylogenies differ from the ones computed by other software. The method also makes it easy to express domain-specific information (e.g., temporal and geographical constraints) and thereby to rule out some phylogenies that are not plausible.
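To make the character-based setting concrete, the following is a minimal Python sketch of one standard notion from cladistics, character compatibility: a character is taken to be compatible with a tree when, for each of its states, the taxa sharing that state can be connected without crossing the taxa of another state. This is not the answer set programming encoding described above. The tree, the taxa names (`A` to `D`), the internal vertices (`x`, `y`), and the function names are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def spanning_subtree(edges, targets):
    """Return the vertices of the smallest subtree containing all targets."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    targets = set(targets)
    # Repeatedly prune non-target vertices that have at most one neighbour left.
    prunable = [v for v in adj if len(adj[v]) <= 1 and v not in targets]
    while prunable:
        v = prunable.pop()
        if v not in adj:  # already pruned via an earlier queue entry
            continue
        for n in adj.pop(v):
            adj[n].discard(v)
            if len(adj[n]) <= 1 and n not in targets:
                prunable.append(n)
    return set(adj)


def is_compatible(edges, states):
    """Check one character (a map taxon -> state) against an undirected tree.

    The character is compatible when the minimal subtrees spanning the taxa
    of each state are pairwise vertex-disjoint.
    """
    taxa_by_state = {}
    for taxon, state in states.items():
        taxa_by_state.setdefault(state, set()).add(taxon)
    subtrees = [spanning_subtree(edges, taxa) for taxa in taxa_by_state.values()]
    return all(a.isdisjoint(b) for a, b in combinations(subtrees, 2))


# Hypothetical tree ((A,B),(C,D)) with internal vertices x and y.
EDGES = [("A", "x"), ("B", "x"), ("x", "y"), ("C", "y"), ("D", "y")]

# A and B share one state, C and D the other: the groups do not cross.
groups_ab_cd = {"A": 0, "B": 0, "C": 1, "D": 1}
# A and C share one state, B and D the other: both groups need the edge x-y.
groups_ac_bd = {"A": 0, "B": 1, "C": 0, "D": 1}

print(is_compatible(EDGES, groups_ab_cd))
print(is_compatible(EDGES, groups_ac_bd))
```

In this sketch the first character fits the tree and the second does not. A declarative formulation would state such conditions as constraints and leave the search over candidate trees to a solver, which is also where domain-specific restrictions such as temporal or geographical constraints could be added.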
