Bayesian Learning of Sum-Product Networks

Sum-product networks (SPNs) are flexible density estimators and have received significant attention due to their attractive inference properties. While parameter learning in SPNs is well developed, structure learning leaves something to be desired: even though there is a plethora of SPN structure learners, most of them are somewhat ad hoc and based on intuition rather than a clear learning principle. In this paper, we introduce a well-principled Bayesian framework for SPN structure learning. First, we decompose the problem into i) laying out a computational graph, and ii) learning the so-called scope function over the graph. The first is rather unproblematic and akin to neural network architecture validation. The second represents the effective structure of the SPN and needs to respect the usual structural constraints in SPNs, i.e., completeness and decomposability. While representing and learning the scope function is somewhat involved in general, in this paper we propose a natural parametrisation for an important and widely used special case of SPNs. These structural parameters are incorporated into a Bayesian model, such that simultaneous structure and parameter learning is cast into monolithic Bayesian posterior inference. In various experiments, our Bayesian SPNs often improve test likelihoods over greedy SPN learners. Further, since the Bayesian framework protects against overfitting, we can evaluate hyper-parameters directly on the Bayesian model score, waiving the need for a separate validation set, which is especially beneficial in low-data regimes. Bayesian SPNs can be applied to heterogeneous domains and can easily be extended to nonparametric formulations. Moreover, our Bayesian approach is the first that consistently and robustly learns SPN structures under missing data.
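
To make the two structural constraints concrete, the following is a minimal sketch of how a candidate scope function over a fixed computational graph could be checked for completeness and decomposability. It is an illustration only and not the parametrisation proposed in the paper; the function name `is_valid_scope_function`, the dictionary-based graph encoding, and the node labels are all assumptions made for this example.

```python
from itertools import combinations


def is_valid_scope_function(graph, scope):
    """Check completeness and decomposability of a scope assignment.

    graph: dict mapping node -> (kind, children), kind in {"sum", "product", "leaf"}
    scope: dict mapping node -> frozenset of variable names
    (Both encodings are hypothetical, chosen only for this sketch.)
    """
    for node, (kind, children) in graph.items():
        child_scopes = [scope[c] for c in children]
        if kind == "sum":
            # Completeness: every child of a sum node has the sum node's scope.
            if any(s != scope[node] for s in child_scopes):
                return False
        elif kind == "product":
            # Decomposability: children have pairwise disjoint scopes ...
            if any(a & b for a, b in combinations(child_scopes, 2)):
                return False
            # ... which together make up the product node's scope.
            if frozenset().union(*child_scopes) != scope[node]:
                return False
    return True


# A small computational graph: one sum node over two product nodes.
graph = {
    "S": ("sum", ["P1", "P2"]),
    "P1": ("product", ["A1", "B1"]),
    "P2": ("product", ["A2", "B2"]),
    "A1": ("leaf", []), "B1": ("leaf", []),
    "A2": ("leaf", []), "B2": ("leaf", []),
}

xy = frozenset({"X", "Y"})
good_scope = {
    "S": xy, "P1": xy, "P2": xy,
    "A1": frozenset({"X"}), "B1": frozenset({"Y"}),
    "A2": frozenset({"X"}), "B2": frozenset({"Y"}),
}
# Same graph, but both children of P2 now claim the variable X.
bad_scope = dict(good_scope, B2=frozenset({"X"}))

valid_good = is_valid_scope_function(graph, good_scope)
valid_bad = is_valid_scope_function(graph, bad_scope)
```

In this sketch the graph stays fixed while only the scope assignment changes, mirroring the split between laying out a computational graph and learning the scope function over it.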
