Phylogenetic ideals and varieties for the general Markov model

The general Markov model of the evolution of biological sequences along a tree leads to a parameterization of an algebraic variety. Understanding this variety and the polynomials, called phylogenetic invariants, which vanish on it, is a problem within the broader area of Algebraic Statistics. For an arbitrary trivalent tree, we determine the full ideal of invariants for the 2-state model, establishing a conjecture of Pachter-Sturmfels. For the κ-state model, we reduce the problem of determining a defining set of polynomials to that of determining a defining set for a 3-leaf tree. Along the way, we prove several new cases of a conjecture of Garcia-Stillman-Sturmfels on certain statistical models on star trees, and reduce their conjecture to a family of subcases.
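To make the notion of a phylogenetic invariant concrete, the following minimal sketch builds the joint leaf distribution of a 2-state general Markov model on a 4-leaf tree with split 12|34 and checks that every 3x3 minor of the 4x4 flattening along the internal edge vanishes. The flattening factors through the two internal nodes, so its rank is at most 2. All parameter values, the choice of root, and the helper names (`markov`, `joint`, `det3`) are illustrative assumptions and are not taken from the paper.

```python
from fractions import Fraction as Fr
from itertools import combinations, product

# Hypothetical parameters: root distribution at internal node a (arbitrary choice).
root = [Fr(3, 10), Fr(7, 10)]

def markov(p, q):
    """Return a 2x2 row-stochastic matrix with off-diagonal entries p and q."""
    return [[1 - p, p], [q, 1 - q]]

# Arbitrary transition matrices on the five edges of the tree 12|34.
M1 = markov(Fr(1, 10), Fr(2, 10))  # a -> leaf 1
M2 = markov(Fr(3, 10), Fr(1, 10))  # a -> leaf 2
E = markov(Fr(2, 10), Fr(3, 10))   # a -> b (internal edge)
M3 = markov(Fr(1, 10), Fr(4, 10))  # b -> leaf 3
M4 = markov(Fr(2, 10), Fr(1, 10))  # b -> leaf 4

def joint(i, j, k, l):
    """Probability of observing states (i, j, k, l) at leaves 1..4."""
    return sum(
        root[a] * M1[a][i] * M2[a][j] * E[a][b] * M3[b][k] * M4[b][l]
        for a in range(2)
        for b in range(2)
    )

# Flatten the 2x2x2x2 tensor along the internal edge:
# rows are indexed by (leaf 1, leaf 2), columns by (leaf 3, leaf 4).
states = list(product(range(2), repeat=2))
flat = [[joint(i, j, k, l) for (k, l) in states] for (i, j) in states]

def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

# Evaluate all 16 of the 3x3 minors of the flattening, in exact arithmetic.
minors = [
    det3([[flat[r][c] for c in cols] for r in rows])
    for rows in combinations(range(4), 3)
    for cols in combinations(range(4), 3)
]

total = sum(sum(row) for row in flat)
all_vanish = all(m == 0 for m in minors)
print("entries sum to", total, "| all 3x3 minors vanish:", all_vanish)
```

Exact rational arithmetic is used so that the minors are tested for equality with zero rather than against a floating-point tolerance. Changing the parameter values should leave the minors at zero, which is what it means for these polynomials to vanish on the whole parameterized family.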
