Paper information - A new descriptive clustering algorithm based on Nonnegative Matrix Factorization

A new descriptive clustering algorithm based on Nonnegative Matrix Factorization

Nonnegative matrix factorization (NMF) provides a way to find a parts-based representation of nonnegative data. An important property of NMF is that it can produce a sparse representation of the data; however, in some applications, especially text clustering, the sparse representation consists of separated words, which cannot explicitly express the meaning of a basis vector. This paper presents a new descriptive clustering algorithm based on NMF, called DC-NMF, that avoids this separated-word problem. The proposed method uses a phrase-by-document matrix in addition to the commonly used term-by-document matrix. It then uses conjugate gradient descent to minimize the mean squared error objective function. Finally, each cluster is described by its highest-weighted element, which corresponds to one particular phrase. The experimental results indicate that the method obtains more "pure" clusters than other methods.
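The idea in the abstract can be illustrated with a small sketch. It is not the authors' DC-NMF implementation, and several parts are assumptions made for illustration: the term-by-document and phrase-by-document matrices are simply stacked into one matrix, the squared-error objective is minimized with the standard multiplicative update rules instead of the conjugate gradient method the abstract mentions, and the vocabulary, phrases, and counts are invented toy data. Each cluster is then labeled with the phrase that has the highest weight in its basis vector.

```python
import numpy as np

def nmf(V, k, iters=500, seed=0, eps=1e-9):
    """Approximate nonnegative V (features x docs) as W @ H.

    Uses multiplicative updates for the squared-error objective
    (a stand-in here, not the optimizer named in the abstract).
    """
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], k)) + eps
    H = rng.random((k, V.shape[1])) + eps
    for _ in range(iters):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Hypothetical toy data: rows are terms or phrases, columns are documents.
terms = ["matrix", "factorization", "soccer", "league"]
phrases = ["matrix factorization", "soccer league"]
term_doc = np.array([
    [2, 1, 0, 0],
    [1, 2, 0, 0],
    [0, 0, 2, 1],
    [0, 0, 1, 2],
], dtype=float)
phrase_doc = np.array([
    [1, 1, 0, 0],
    [0, 0, 1, 1],
], dtype=float)

# Assumption: combine both matrices by stacking their rows.
V = np.vstack([term_doc, phrase_doc])
W, H = nmf(V, k=2)

# Label each cluster with its highest-weighted phrase row of W.
W_phrases = W[len(terms):]
cluster_labels = [phrases[i] for i in W_phrases.argmax(axis=0)]

# Assign each document to the cluster with the largest coefficient in H.
doc_clusters = H.argmax(axis=0)
doc_labels = [cluster_labels[c] for c in doc_clusters]
print(doc_labels)
```

Because the label comes from the phrase rows only, each cluster is described by a readable phrase and not by a set of isolated words, which is the problem the abstract sets out to avoid.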

Xindong Wu | Hong Peng | Zhao Li
