Nonparametric Tree Graphical Models via Kernel Embeddings

We introduce a nonparametric representation for graphical models on trees which expresses marginals as Hilbert space embeddings and conditionals as embedding operators. This formulation allows us to define a graphical model solely on the basis of the feature space representation of its variables. Thus, this nonparametric model can be applied to general domains where kernels are defined, handling challenging cases such as discrete variables with huge domains, or very complex, non-Gaussian continuous distributions. We also derive kernel belief propagation, a Hilbert-space algorithm for performing inference in our model. We show that our method outperforms state-of-the-art techniques in a cross-lingual document retrieval task and a camera rotation estimation problem.
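To illustrate what representing a marginal as a Hilbert space embedding can look like in practice, the following minimal sketch computes an empirical kernel mean embedding from samples and compares two such embeddings in the feature space. The Gaussian RBF kernel, the bandwidth value, and all function names are assumptions chosen for illustration and are not taken from the paper. The sketch covers marginals only; it does not implement conditional embedding operators or kernel belief propagation.

```python
import math


def rbf_kernel(x, y, bandwidth=1.0):
    """Gaussian RBF kernel k(x, y) = exp(-||x - y||^2 / (2 * bandwidth^2)).

    The kernel choice and bandwidth are illustrative assumptions.
    """
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mean_embedding(samples, kernel=rbf_kernel):
    """Return the empirical embedding mu_hat(.) = (1/m) * sum_i k(x_i, .)."""
    m = len(samples)

    def mu_hat(t):
        # Evaluating the embedding at t averages the kernel over the sample.
        return sum(kernel(x, t) for x in samples) / m

    return mu_hat


def embedding_inner_product(samples_p, samples_q, kernel=rbf_kernel):
    """Estimate <mu_P, mu_Q> as the mean of all pairwise kernel values."""
    total = sum(kernel(x, y) for x in samples_p for y in samples_q)
    return total / (len(samples_p) * len(samples_q))


def squared_embedding_distance(samples_p, samples_q, kernel=rbf_kernel):
    """Estimate ||mu_P - mu_Q||^2 by expanding the squared norm."""
    return (
        embedding_inner_product(samples_p, samples_p, kernel)
        - 2.0 * embedding_inner_product(samples_p, samples_q, kernel)
        + embedding_inner_product(samples_q, samples_q, kernel)
    )


if __name__ == "__main__":
    # Hypothetical one-dimensional samples, each stored as a 1-tuple.
    p = [(0.0,), (0.5,), (1.0,)]
    q = [(3.0,), (3.5,), (4.0,)]
    mu_p = mean_embedding(p)
    print("mu_P evaluated at 0.5:", mu_p((0.5,)))
    print("squared distance between embeddings:", squared_embedding_distance(p, q))
```

Because the embedding is accessed only through kernel evaluations, the same code would apply to any domain on which a kernel is defined, for example strings or documents, by passing a different `kernel` function.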
