Collapsed Variational Inference for Sum-Product Networks

Sum-Product Networks (SPNs) are probabilistic inference machines that admit exact inference in time linear in the size of the network. Existing parameter learning approaches for SPNs are largely based on the maximum likelihood principle and are prone to overfitting compared with Bayesian approaches. Exact Bayesian posterior inference for SPNs is computationally intractable, and even approximation techniques such as standard variational inference and posterior sampling become infeasible for networks of moderate size due to the large number of local latent variables per instance. In this work, we propose a novel deterministic collapsed variational inference algorithm for SPNs that is computationally efficient, easy to implement, and at the same time allows us to incorporate prior information into the optimization formulation. Extensive experiments show a significant improvement in accuracy compared with a maximum-likelihood-based approach.
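To make the "exact inference in linear time" claim concrete, the following is a minimal sketch of bottom-up evaluation of a toy SPN over two binary variables. The node classes, the example structure, and all weights are illustrative assumptions, not taken from the paper; the collapsed variational inference algorithm itself is described in the paper, not here.

```python
# Minimal sketch: bottom-up evaluation of a toy SPN over two binary
# variables X1, X2. Structure and parameters are illustrative only.

class Leaf:
    def __init__(self, var, value_probs):
        self.var = var                  # index of the variable this leaf models
        self.value_probs = value_probs  # [P(X_var = 0), P(X_var = 1)]

    def value(self, x):
        return self.value_probs[x[self.var]]

class Product:
    def __init__(self, children):
        self.children = children

    def value(self, x):
        v = 1.0
        for c in self.children:         # product over the children's values
            v *= c.value(x)
        return v

class Sum:
    def __init__(self, children, weights):
        self.children = children
        self.weights = weights          # nonnegative mixture weights summing to one

    def value(self, x):
        return sum(w * c.value(x) for w, c in zip(self.weights, self.children))

# A tiny complete and decomposable SPN: a mixture of two product nodes.
leaf_x1_a = Leaf(0, [0.8, 0.2])
leaf_x1_b = Leaf(0, [0.3, 0.7])
leaf_x2_a = Leaf(1, [0.6, 0.4])
leaf_x2_b = Leaf(1, [0.1, 0.9])
root = Sum([Product([leaf_x1_a, leaf_x2_a]),
            Product([leaf_x1_b, leaf_x2_b])],
           weights=[0.4, 0.6])

# One bottom-up pass touches each node and edge exactly once, so exact
# evaluation of the joint probability costs time linear in the network size.
print(root.value({0: 1, 1: 0}))   # P(X1 = 1, X2 = 0)
```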
