Closed-form Marginal Likelihood in Gamma-Poisson Matrix Factorization

We present novel understandings of the Gamma-Poisson (GaP) model, a probabilistic matrix factorization model for count data. We show that GaP can be rewritten free of the score/activation matrix. This gives us new insights about the estimation of the topic/dictionary matrix by maximum marginal likelihood estimation. In particular, this explains the robustness of this estimator to over-specified values of the factorization rank, especially its ability to automatically prune irrelevant dictionary columns, as empirically observed in previous work. The marginalization of the activation matrix leads in turn to a new Monte Carlo Expectation-Maximization algorithm with favorable properties.
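To make the idea of integrating out the activations concrete, here is a minimal sketch of the simplest scalar case only, not the general closed form derived in the paper. It assumes a single count v ~ Poisson(w h) with one activation h ~ Gamma(alpha, beta) in the shape/rate parametrization; the names `w`, `alpha`, `beta` and all function names are illustrative assumptions. In this case the marginal of v is a negative binomial distribution, and the code checks that formula against direct numerical integration over h.

```python
import math


def poisson_logpmf(v, rate):
    """Log-probability of count v under a Poisson with the given rate."""
    return v * math.log(rate) - rate - math.lgamma(v + 1)


def gamma_logpdf(h, alpha, beta):
    """Log-density of Gamma(shape=alpha, rate=beta) at h > 0 (assumed parametrization)."""
    return (alpha * math.log(beta) + (alpha - 1) * math.log(h)
            - beta * h - math.lgamma(alpha))


def marginal_closed_form(v, w, alpha, beta):
    """p(v | w) with h integrated out analytically: a negative binomial pmf."""
    log_p = (math.lgamma(v + alpha) - math.lgamma(v + 1) - math.lgamma(alpha)
             + alpha * math.log(beta / (beta + w))
             + v * math.log(w / (beta + w)))
    return math.exp(log_p)


def marginal_numeric(v, w, alpha, beta, upper=50.0, steps=50_000):
    """p(v | w) by midpoint-rule integration of Poisson(v; w*h) * Gamma(h) over h."""
    dh = upper / steps
    total = 0.0
    for i in range(steps):
        h = (i + 0.5) * dh  # midpoint of the i-th cell, always > 0
        total += math.exp(poisson_logpmf(v, w * h) + gamma_logpdf(h, alpha, beta))
    return total * dh


if __name__ == "__main__":
    w, alpha, beta = 1.5, 2.0, 1.0  # hypothetical values for illustration
    for v in range(6):
        exact = marginal_closed_form(v, w, alpha, beta)
        approx = marginal_numeric(v, w, alpha, beta)
        print(f"v={v}: closed form={exact:.8f}, numeric={approx:.8f}")
```

With several dictionary columns and a full activation matrix the marginal is no longer a plain negative binomial, which is where the paper's closed-form result comes in; the sketch only shows why the activation variable can be removed from the likelihood at all.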
