Soft-constrained nonnegative matrix factorization via normalization

Semi-supervised clustering aims to improve clustering performance on unlabeled samples by exploiting the labels of a few labeled samples. Constrained NMF (CNMF) is one of the most significant semi-supervised clustering methods: it factorizes the whole dataset by NMF and constrains labeled samples from the same class to have identical encodings. In this paper, we propose a novel soft-constrained NMF (SCNMF) method that softens the hard constraint in CNMF. In particular, SCNMF factorizes the whole dataset into two lower-dimensional factor matrices using the multiplicative update rule (MUR). To exploit the labels, SCNMF iteratively normalizes both factor matrices after updating them with MUR so that the encodings of labeled samples are close to their label vectors. It is therefore reasonable to expect that the encodings of unlabeled samples are also close to their corresponding label vectors. This strategy significantly boosts clustering performance even when labeled samples are scarce, e.g., when each class has only a single labeled sample. Since the normalization procedure does not increase the computational complexity of MUR, SCNMF is both efficient and effective in practice. Experimental results on face image datasets demonstrate the efficiency and effectiveness of SCNMF compared with both NMF and CNMF.
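The update-then-normalize loop described above can be illustrated with a minimal sketch. The multiplicative updates are the standard Frobenius-norm rules of Lee and Seung; the normalization step is only an assumption made for illustration, since the abstract does not state the exact rule. Here each component c is rescaled so that the mean of its coefficient over samples labeled c equals 1, with the matching column of W scaled inversely so that the product W H is unchanged. This fixes only the scale of the labeled encodings and may differ from the normalization actually used by SCNMF. The names `scnmf_sketch` and `label_rescale`, and the convention that `-1` marks an unlabeled sample, are hypothetical.

```python
import numpy as np


def label_rescale(W, H, labels, eps=1e-10):
    """Assumed normalization (not taken from the paper): rescale component c
    so that H[c, j] averages to 1 over the samples labeled c, and scale
    W[:, c] inversely so that W @ H stays the same."""
    for c in range(H.shape[0]):
        idx = np.flatnonzero(labels == c)
        if idx.size == 0:
            continue  # no labeled sample for this class; leave it as is
        s = H[c, idx].mean() + eps
        H[c, :] /= s
        W[:, c] *= s
    return W, H


def scnmf_sketch(X, labels, k, n_iter=200, eps=1e-10, seed=0):
    """Factorize nonnegative X (features x samples) as W @ H with k components.

    labels[j] is the class index (0..k-1) of sample j, or -1 if unlabeled.
    """
    rng = np.random.default_rng(seed)
    m, n = X.shape
    W = rng.random((m, k)) + eps
    H = rng.random((k, n)) + eps
    for _ in range(n_iter):
        # Standard multiplicative updates for the Frobenius-norm objective.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
        # Normalization after the updates; costs O((m + n) k) per iteration,
        # which is dominated by the O(m n k) cost of the updates themselves.
        W, H = label_rescale(W, H, labels, eps)
    return W, H


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.random((20, 12))          # 20 features, 12 samples
    labels = np.full(12, -1)
    labels[:3] = [0, 1, 2]            # one labeled sample per class
    W, H = scnmf_sketch(X, labels, k=3)
    clusters = H.argmax(axis=0)       # assign each sample to its largest coefficient
    print(clusters)
```

Under this assumed scheme, cluster assignments are read off by taking the largest coefficient in each column of H, so unlabeled samples are compared against components whose scale has been anchored by the labeled ones.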
