Linear and Nonlinear Projective Nonnegative Matrix Factorization

A variant of nonnegative matrix factorization (NMF) which was proposed earlier is analyzed here. It is called projective nonnegative matrix factorization (PNMF). The new method approximately factorizes a projection matrix, minimizing the reconstruction error, into a positive low-rank matrix and its transpose. The dissimilarity between the original data matrix and its approximation can be measured by the Frobenius matrix norm or the modified Kullback-Leibler divergence. Both measures are minimized by multiplicative update rules, whose convergence is proven for the first time. Enforcing orthonormality to the basic objective is shown to lead to an even more efficient update rule, which is also readily extended to nonlinear cases. The formulation of the PNMF objective is shown to be connected to a variety of existing NMF methods and clustering approaches. In addition, the derivation using Lagrangian multipliers reveals the relation between reconstruction and sparseness. For kernel principal component analysis (PCA) with the binary constraint, useful in graph partitioning problems, the nonlinear kernel PNMF provides a good approximation which outperforms an existing discretization approach. Empirical study on three real-world databases shows that PNMF can achieve the best or close to the best in clustering. The proposed algorithm runs more efficiently than the compared NMF methods, especially for high-dimensional data. Moreover, contrary to the basic NMF, the trained projection matrix can be readily used for newly coming samples and demonstrates good generalization.

[1]  Chris H. Q. Ding,et al.  K-means clustering via principal component analysis , 2004, ICML.

[2]  Jitendra Malik,et al.  Normalized cuts and image segmentation , 1997, Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[3]  Stan Z. Li,et al.  Local non-negative matrix factorization as a visual representation , 2002, Proceedings 2nd International Conference on Development and Learning. ICDL 2002.

[4]  Chris H. Q. Ding,et al.  Convex and Semi-Nonnegative Matrix Factorizations , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[5]  Daniel D. Lee,et al.  Multiplicative Updates for Nonnegative Quadratic Programming in Support Vector Machines , 2002, NIPS.

[6]  Juha Karhunen,et al.  Generalizations of principal component analysis, optimization problems, and neural networks , 1995, Neural Networks.

[7]  Jorma Laaksonen,et al.  Projective Non-Negative Matrix Factorization with Applications to Facial Image Processing , 2007, Int. J. Pattern Recognit. Artif. Intell..

[8]  Erkki Oja,et al.  Principal components, minor components, and linear neural networks , 1992, Neural Networks.

[9]  M. Fiedler Algebraic connectivity of graphs , 1973 .

[10]  Alex Pothen,et al.  PARTITIONING SPARSE MATRICES WITH EIGENVECTORS OF GRAPHS* , 1990 .

[11]  Samuel Kaski,et al.  Discriminative clustering , 2005, Neurocomputing.

[12]  Pablo Tamayo,et al.  Metagenes and molecular pattern discovery using matrix factorization , 2004, Proceedings of the National Academy of Sciences of the United States of America.

[13]  Andrzej Cichocki,et al.  Nonnegative Matrix and Tensor Factorization T , 2007 .

[14]  H. Sebastian Seung,et al.  Learning the parts of objects by non-negative matrix factorization , 1999, Nature.

[15]  Daniel D. Lee,et al.  Multiplicative Updates for Large Margin Classifiers , 2003, COLT.

[16]  Baowen Xu,et al.  A constrained non-negative matrix factorization in information retrieval , 2003, Proceedings Fifth IEEE Workshop on Mobile Computing Systems and Applications.

[17]  Nanning Zheng,et al.  Non-negative matrix factorization for visual coding , 2003, 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03)..

[18]  Chris H. Q. Ding,et al.  Orthogonal nonnegative matrix t-factorizations for clustering , 2006, KDD '06.

[19]  Jorma Laaksonen,et al.  Multiplicative updates for non-negative projections , 2007, Neurocomputing.

[20]  Manfred K. Warmuth,et al.  Exponentiated Gradient Versus Gradient Descent for Linear Predictors , 1997, Inf. Comput..

[21]  M. Newman,et al.  Finding community structure in networks using the eigenvectors of matrices. , 2006, Physical review. E, Statistical, nonlinear, and soft matter physics.

[22]  E. Oja Simplified neuron model as a principal component analyzer , 1982, Journal of mathematical biology.

[23]  Patrik O. Hoyer,et al.  Non-negative Matrix Factorization with Sparseness Constraints , 2004, J. Mach. Learn. Res..

[24]  H. Sebastian Seung,et al.  Algorithms for Non-negative Matrix Factorization , 2000, NIPS.

[25]  Chih-Jen Lin,et al.  Projected Gradient Methods for Nonnegative Matrix Factorization , 2007, Neural Computation.

[26]  Jonathan Foote,et al.  Summarizing video using non-negative similarity matrix factorization , 2002, 2002 IEEE Workshop on Multimedia Signal Processing..

[27]  Chris H. Q. Ding,et al.  On the Equivalence of Nonnegative Matrix Factorization and Spectral Clustering , 2005, SDM.

[28]  S. P. Lloyd,et al.  Least squares quantization in PCM , 1982, IEEE Trans. Inf. Theory.

[29]  Alexander J. Smola,et al.  Learning with kernels , 1998 .

[30]  Jianbo Shi,et al.  Multiclass spectral clustering , 2003, Proceedings Ninth IEEE International Conference on Computer Vision.

[31]  Hyunsoo Kim,et al.  Sparse Non-negative Matrix Factorizations via Alternating Non-negativity-constrained Least Squares , 2006 .

[32]  Erkki Oja,et al.  Projective Nonnegative Matrix Factorization : Sparseness , Orthogonality , and Clustering , 2009 .

[33]  Erkki Oja,et al.  Projective Nonnegative Matrix Factorization for Image Compression and Feature Extraction , 2005, SCIA.

[34]  Stefan M. Wild,et al.  Improving non-negative matrix factorizations through structured initialization , 2004, Pattern Recognit..

[35]  Inderjit S. Dhillon,et al.  Kernel k-means: spectral clustering and normalized cuts , 2004, KDD.

[36]  Xin Liu,et al.  Document clustering based on non-negative matrix factorization , 2003, SIGIR.

[37]  Tao Li,et al.  The Relationships Among Various Nonnegative Matrix Factorization Methods for Clustering , 2006, Sixth International Conference on Data Mining (ICDM'06).