Dropout Non-negative Matrix Factorization for Independent Feature Learning

Non-negative Matrix Factorization (NMF) can learn interpretable, parts-based representations of natural data and is widely applied in data mining and machine learning. However, NMF does not always achieve good performance, because the non-negativity constraint allows the learned features to be non-orthogonal and to overlap in semantics. How to improve the semantic independence of latent features without reducing the interpretability of NMF remains an open research problem. In this paper, we put forward dropout NMF and its extension, sequential NMF, to enhance the semantic independence of NMF. Dropout NMF prevents the co-adaptation of latent features to reduce ambiguity, while sequential NMF further promotes the independence of individual latent features. The proposed algorithms differ from traditional regularized and weighted methods in that they require no prior knowledge and introduce no extra constraints or transformations. Extensive experiments on document clustering show that our algorithms outperform baseline methods and can be seamlessly applied to NMF-based models.
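The abstract does not give the update rule, so the sketch below is only one plausible reading of the idea: standard multiplicative NMF updates (Lee and Seung, 2000) in which, at each iteration, a random subset of the k latent features is frozen so that the retained features cannot co-adapt. The function name dropout_nmf and the parameter p_drop are illustrative assumptions, not names from the paper.

```python
import numpy as np

def dropout_nmf(V, k, n_iter=200, p_drop=0.3, eps=1e-10, seed=0):
    """Minimal sketch of NMF with dropout over latent features.

    V : (m, n) non-negative data matrix, factorized as V ~ W @ H.
    At each iteration a random subset of the k latent features is
    dropped (frozen), so the remaining features cannot co-adapt.
    This is an illustrative reading of the abstract, not the
    authors' exact algorithm.
    """
    rng = np.random.default_rng(seed)
    m, n = V.shape
    W = rng.random((m, k)) + eps
    H = rng.random((k, n)) + eps

    for _ in range(n_iter):
        # Sample a retain mask over the latent features (True = update).
        keep = rng.random(k) >= p_drop
        if not keep.any():             # ensure at least one active feature
            keep[rng.integers(k)] = True

        # Standard multiplicative updates, applied only to the
        # retained latent dimensions; dropped ones stay frozen.
        H_new = H * (W.T @ V) / (W.T @ W @ H + eps)
        H[keep, :] = H_new[keep, :]

        W_new = W * (V @ H.T) / (W @ (H @ H.T) + eps)
        W[:, keep] = W_new[:, keep]

    return W, H
```

Under this reading, a call such as W, H = dropout_nmf(V, k=20) would behave like plain multiplicative NMF when p_drop=0, with larger p_drop values forcing each latent feature to explain the data without relying on a fixed set of partners.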
