A Double Parametric Bootstrap Test for Topic Models

Non-negative matrix factorization (NMF) is a technique for finding latent representations of data, and it has been applied to text corpora to construct topic models. However, NMF carries implicit likelihood assumptions that real document corpora often violate. We present a double parametric bootstrap test that evaluates the fit of an NMF-based topic model by exploiting the duality between KL divergence minimization and Poisson maximum likelihood estimation. In both simulated and real data, the test correctly identifies whether an NMF-based topic model yields reliable results.
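The duality mentioned above can be made explicit with a short derivation. The notation here is an assumption for illustration and may differ from the paper's: V is a term-by-document count matrix, W and H are the non-negative factors, and the entries of V are modeled as independent Poisson variables with means given by the entries of WH.

```latex
% Assumed notation: V \in \mathbb{N}^{m \times n} (counts), W \ge 0, H \ge 0,
% \Lambda = WH, and V_{ij} \sim \mathrm{Poisson}(\Lambda_{ij}) independently.
% Convention: 0 \log 0 = 0.
\begin{align*}
\log L(W, H)
  &= \sum_{i,j} \left( V_{ij} \log \Lambda_{ij} - \Lambda_{ij} - \log V_{ij}! \right), \\
D_{\mathrm{KL}}(V \,\|\, \Lambda)
  &= \sum_{i,j} \left( V_{ij} \log \frac{V_{ij}}{\Lambda_{ij}} - V_{ij} + \Lambda_{ij} \right), \\
-\log L(W, H)
  &= D_{\mathrm{KL}}(V \,\|\, \Lambda)
   + \underbrace{\sum_{i,j} \left( \log V_{ij}! - V_{ij} \log V_{ij} + V_{ij} \right)}_{\text{independent of } W,\, H}.
\end{align*}
```

Because the last term does not depend on W or H, minimizing the generalized KL divergence over the factors gives the same solution as maximizing the Poisson likelihood. Under this reading, fitting NMF with the KL objective implicitly assumes Poisson-distributed counts, which is the kind of assumption a parametric bootstrap can check by simulating from the fitted model.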
