Marginal likelihood and model selection for Gaussian latent tree and forest models

Gaussian latent tree models, or more generally Gaussian latent forest models, have Fisher information matrices that become singular along interesting submodels, namely those that correspond to subforests. For these singularities, we compute the real log-canonical thresholds (also known as stochastic complexities or learning coefficients), which quantify the large-sample behavior of the marginal likelihood in Bayesian inference. This provides the information needed for a recently introduced generalization of the Bayesian information criterion. Our mathematical developments treat the general setting of Laplace integrals whose phase functions are sums of squared differences between monomials and constants. We clarify how, in this case, real log-canonical thresholds can be computed using polyhedral geometry, and we show how to apply the general theory to the Laplace integrals associated with Gaussian latent tree and forest models. In simulations and a data example, we demonstrate how these results can be applied in model selection.
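
For orientation, the quantities named above can be written out as follows. This is a sketch in notation assumed here rather than taken from the paper: $\lambda$ denotes the real log-canonical threshold, $m$ its multiplicity, $\hat\theta$ a maximum likelihood estimate, $\ell_n$ the log-likelihood for sample size $n$, $\varphi$ a prior density on the parameter space $\Omega$, and $u_i$, $c_i$ the exponent vectors and constants of the monomial terms. The first line is the general form of the Laplace integral with the kind of phase function described in the abstract; the second is the standard asymptotic expansion from singular learning theory that such thresholds enter.

```latex
% Laplace integral whose phase function H is a sum of squared
% differences between monomials \omega^{u_i} and constants c_i
% (notation assumed for illustration)
\[
  Z(n) \;=\; \int_{\Omega} e^{-n H(\omega)}\, \varphi(\omega)\, d\omega,
  \qquad
  H(\omega) \;=\; \sum_{i=1}^{k} \bigl(\omega^{u_i} - c_i\bigr)^{2}.
\]

% Large-sample behavior of the log marginal likelihood L(M) of a model M,
% governed by the real log-canonical threshold \lambda and multiplicity m
\[
  \log L(M) \;=\; \ell_n(\hat\theta) \;-\; \lambda \log n
  \;+\; (m-1)\log\log n \;+\; O_p(1).
\]
```

In a regular model with $d$ parameters one has $\lambda = d/2$ and $m = 1$, so the second expansion reduces to the penalty of the usual Bayesian information criterion; at the singularities, $\lambda$ and $m$ can take other values, which is why they must be computed separately.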