Tropical geometry of statistical models.

This article presents a unified mathematical framework for inference in graphical models, building on the observation that graphical models are algebraic varieties. From this geometric viewpoint, observations generated from a model are coordinates of a point in the variety, and the sum-product algorithm is an efficient tool for evaluating specific coordinates. Here, we address the question of how the solutions to various inference problems depend on the model parameters. The proposed answer is expressed in terms of tropical algebraic geometry. The Newton polytope of a statistical model plays a key role. Our results are applied to the hidden Markov model and the general Markov model on a binary tree.
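The link between the sum-product algorithm and tropical arithmetic can be illustrated on a small hidden Markov model. The sketch below is not taken from the article. It assumes a hypothetical two-state model with made-up parameters (`init`, `trans`, `emit`) and a helper named `forward`, and it runs the same forward recursion twice: once in the ordinary semiring (+, ×), which evaluates one coordinate of the model (the probability of the observed sequence), and once in the tropical semiring (min, +) on negative log parameters, which yields the negative log weight of the most probable hidden path.

```python
import math
import operator
from itertools import product

def forward(init, trans, emit, obs, add, mul):
    """Forward recursion for an HMM over a semiring given by (add, mul)."""
    n = len(init)
    # alpha[j]: semiring sum over all hidden paths that end in state j
    alpha = [mul(init[j], emit[j][obs[0]]) for j in range(n)]
    for o in obs[1:]:
        new_alpha = []
        for j in range(n):
            acc = mul(alpha[0], trans[0][j])
            for i in range(1, n):
                acc = add(acc, mul(alpha[i], trans[i][j]))
            new_alpha.append(mul(acc, emit[j][o]))
        alpha = new_alpha
    total = alpha[0]
    for a in alpha[1:]:
        total = add(total, a)
    return total

# Hypothetical parameters, chosen only for illustration (all strictly positive).
init = [0.6, 0.4]
trans = [[0.7, 0.3], [0.4, 0.6]]
emit = [[0.9, 0.1], [0.2, 0.8]]
obs = [0, 1, 1]

def neg_log_table(table):
    """Map each probability x in a 2-D table to -log(x)."""
    return [[-math.log(x) for x in row] for row in table]

# Ordinary semiring (+, *): probability of the observed sequence.
prob = forward(init, trans, emit, obs, operator.add, operator.mul)

# Tropical semiring (min, +) on -log parameters: best hidden-path score.
trop = forward([-math.log(x) for x in init], neg_log_table(trans),
               neg_log_table(emit), obs, min, operator.add)

# Brute-force check: the weight of every hidden path, enumerated explicitly.
weights = []
for path in product(range(len(init)), repeat=len(obs)):
    w = init[path[0]] * emit[path[0]][obs[0]]
    for t in range(1, len(obs)):
        w *= trans[path[t - 1]][path[t]] * emit[path[t]][obs[t]]
    weights.append(w)

print(prob, sum(weights))
print(trop, -math.log(max(weights)))
```

In this sketch only the pair `(add, mul)` changes between the two runs; the tropical run computes the value that the Viterbi algorithm would report for the same model.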
