Learning the Structure of Sum-Product Networks via an SVD-based Algorithm

Sum-product networks (SPNs) are a recently developed class of deep probabilistic models in which inference is tractable. We present two new structure learning algorithms for sum-product networks, one for the generative setting and one for the discriminative setting, both based on recursively extracting rank-one submatrices from the data. The proposed algorithms find the sub-SPNs that are most coherent jointly in the instances and variables, that is, those whose instances are most strongly correlated over the given variables. Experimental results show that SPNs learned using the proposed generative algorithm have better likelihood and inference results than previous approaches, and are also much faster. Finally, we apply the discriminative SPN structure learning algorithm to handwritten digit recognition tasks, where it achieves state-of-the-art performance for an SPN.
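As a rough illustration of what a single rank-one extraction step could look like, the sketch below uses the leading singular vectors of a data matrix (rows are instances, columns are variables) to select a block of instances and variables that vary together. It is a simplified stand-in and not the algorithm from the paper: the function names, the thresholds `row_frac` and `col_frac`, and the planted-block toy data are all assumptions made for this example.

```python
import numpy as np


def extract_rank_one_block(X, row_frac=0.5, col_frac=0.5):
    """Select rows and columns dominated by the leading singular pair of X.

    Hypothetical helper: the thresholding rule is an assumption for
    illustration, not the selection criterion used in the paper.
    """
    # Thin SVD; U[:, 0] and Vt[0, :] are the leading singular vectors.
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    u = np.abs(U[:, 0])   # per-instance weight in the leading component
    v = np.abs(Vt[0, :])  # per-variable weight in the leading component
    # Keep instances and variables whose weight is a large fraction of the max.
    rows = np.flatnonzero(u >= row_frac * u.max())
    cols = np.flatnonzero(v >= col_frac * v.max())
    return rows, cols


def planted_block_demo(seed=0):
    """Toy data: weak noise plus one strongly coherent 8 x 5 block."""
    rng = np.random.default_rng(seed)
    X = 0.05 * rng.random((20, 12))
    X[:8, :5] += 1.0
    return X


if __name__ == "__main__":
    rows, cols = extract_rank_one_block(planted_block_demo())
    print("instances in block:", rows.tolist())
    print("variables in block:", cols.tolist())
```

A recursive structure learner in this spirit would presumably apply the same step again to the extracted block and to the remaining data; how the paper maps those blocks to sum and product nodes is not stated in the abstract, so it is left out here.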
