Laplacian Regularized Gaussian Mixture Model for Data Clustering

Gaussian Mixture Models (GMMs) are among the most statistically mature methods for clustering. Each cluster is represented by a Gaussian distribution. The clustering process thereby turns to estimate the parameters of the Gaussian mixture, usually by the Expectation-Maximization algorithm. In this paper, we consider the case where the probability distribution that generates the data is supported on a submanifold of the ambient space. It is natural to assume that if two points are close in the intrinsic geometry of the probability distribution, then their conditional probability distributions are similar. Specifically, we introduce a regularized probabilistic model based on manifold structure for data clustering, called Laplacian regularized Gaussian Mixture Model (LapGMM). The data manifold is modeled by a nearest neighbor graph, and the graph structure is incorporated in the maximum likelihood objective function. As a result, the obtained conditional probability distribution varies smoothly along the geodesics of the data manifold. Experimental results on real data sets demonstrate the effectiveness of the proposed approach.

[1]  Miguel Á. Carreira-Perpiñán,et al.  Proximity Graphs for Clustering and Manifold Learning , 2004, NIPS.

[2]  Beng Chin Ooi,et al.  Continuous Clustering of Moving Objects , 2007, IEEE Transactions on Knowledge and Data Engineering.

[3]  Eugenio Cesario,et al.  Top-Down Parameter-Free Clustering of High-Dimensional Categorical Data , 2007, IEEE Transactions on Knowledge and Data Engineering.

[4]  Chris H. Q. Ding,et al.  Orthogonal nonnegative matrix t-factorizations for clustering , 2006, KDD '06.

[5]  Xin Liu,et al.  Document clustering based on non-negative matrix factorization , 2003, SIGIR.

[6]  Jiawei Han,et al.  Document clustering using locality preserving indexing , 2005, IEEE Transactions on Knowledge and Data Engineering.

[7]  Mikhail Belkin,et al.  Manifold Regularization: A Geometric Framework for Learning from Labeled and Unlabeled Examples , 2006, J. Mach. Learn. Res..

[8]  Qiang Yang,et al.  Self-taught clustering , 2008, ICML '08.

[9]  Deepa Paranjpe,et al.  Semi-supervised clustering with metric learning using relative comparisons , 2005, Fifth IEEE International Conference on Data Mining (ICDM'05).

[10]  H. Zha,et al.  Principal manifolds and nonlinear dimensionality reduction via tangent space alignment , 2004, SIAM J. Sci. Comput..

[11]  H. Sebastian Seung,et al.  The Manifold Ways of Perception , 2000, Science.

[12]  Jitendra Malik,et al.  Normalized cuts and image segmentation , 1997, Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[13]  Stan Z. Li,et al.  Learning spatially localized, parts-based representation , 2001, Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. CVPR 2001.

[14]  Martine D. F. Schlag,et al.  Spectral K-Way Ratio-Cut Partitioning and Clustering , 1993, 30th ACM/IEEE Design Automation Conference.

[15]  R. Plemmons,et al.  Optimality, computation, and interpretation of nonnegative matrix factorizations , 2004 .

[16]  Deng Cai,et al.  Orthogonal locality preserving indexing , 2005, SIGIR '05.

[17]  Yihong Gong,et al.  Document clustering by concept factorization , 2004, SIGIR '04.

[18]  Mikhail Belkin,et al.  Manifold Regularization : A Geometric Framework for Learning from Examples , 2004 .

[19]  J. Tenenbaum,et al.  A global geometric framework for nonlinear dimensionality reduction. , 2000, Science.

[20]  D. Rubin,et al.  Maximum likelihood from incomplete data via the EM - algorithm plus discussions on the paper , 1977 .

[21]  Vladimir Vapnik,et al.  Statistical learning theory , 1998 .

[22]  Jiawei Han,et al.  Modeling hidden topics on document manifold , 2008, CIKM '08.

[23]  Deng Cai,et al.  Topic modeling with network regularization , 2008, WWW.

[24]  Guillermo Ricardo Simari,et al.  Non-commercial Research and Educational Use including without Limitation Use in Instruction at Your Institution, Sending It to Specific Colleagues That You Know, and Providing a Copy to Your Institution's Administrator. All Other Uses, Reproduction and Distribution, including without Limitation Comm , 2022 .

[25]  Fan Chung,et al.  Spectral Graph Theory , 1996 .

[26]  Michael I. Jordan,et al.  On Spectral Clustering: Analysis and an algorithm , 2001, NIPS.

[27]  Deng Cai,et al.  Probabilistic dyadic data analysis with local and global consistency , 2009, ICML '09.

[28]  H. Sebastian Seung,et al.  Algorithms for Non-negative Matrix Factorization , 2000, NIPS.

[29]  Christian Böhm,et al.  Robust information-theoretic clustering , 2006, KDD '06.

[30]  Hongyuan Zha,et al.  Principal Manifolds and Nonlinear Dimension Reduction via Local Tangent Space Alignment , 2002, ArXiv.

[31]  Shigeo Abe DrEng Pattern Classification , 2001, Springer London.

[32]  David G. Stork,et al.  Pattern Classification , 1973 .

[33]  Mark Herbster,et al.  Combining Graph Laplacians for Semi-Supervised Learning , 2005, NIPS.

[34]  Patrik O. Hoyer,et al.  Non-negative Matrix Factorization with Sparseness Constraints , 2004, J. Mach. Learn. Res..

[35]  S T Roweis,et al.  Nonlinear dimensionality reduction by locally linear embedding. , 2000, Science.

[36]  Xiaojin Zhu,et al.  Harmonic mixtures: combining mixture models and graph-based methods for inductive and scalable semi-supervised learning , 2005, ICML.

[37]  Chris H. Q. Ding,et al.  Convex and Semi-Nonnegative Matrix Factorizations , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[38]  Tao Li,et al.  The Relationships Among Various Nonnegative Matrix Factorization Methods for Clustering , 2006, Sixth International Conference on Data Mining (ICDM'06).

[39]  Xuelong Li,et al.  Discriminant Locally Linear Embedding With High-Order Tensor Data , 2008, IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics).

[40]  Fei Wang,et al.  Regularized clustering for documents , 2007, SIGIR.

[41]  Radford M. Neal Pattern Recognition and Machine Learning , 2007, Technometrics.

[42]  Mikhail Belkin,et al.  Laplacian Eigenmaps and Spectral Techniques for Embedding and Clustering , 2001, NIPS.

[43]  V. Mirrokni,et al.  A recommender system based on local random walks and spectral methods , 2007, WebKDD/SNA-KDD '07.

[44]  H. Sebastian Seung,et al.  Learning the parts of objects by non-negative matrix factorization , 1999, Nature.

[45]  Wei-Ying Ma,et al.  Organizing WWW images based on the analysis of page layout and Web link structure , 2004, 2004 IEEE International Conference on Multimedia and Expo (ICME) (IEEE Cat. No.04TH8763).

[46]  John M. Lee Introduction to Smooth Manifolds , 2002 .

[47]  Charu C. Aggarwal,et al.  Xproj: a framework for projected structural clustering of xml documents , 2007, KDD '07.

[48]  Geoffrey E. Hinton,et al.  A View of the Em Algorithm that Justifies Incremental, Sparse, and other Variants , 1998, Learning in Graphical Models.

[49]  Jonathan J. Hull,et al.  A Database for Handwritten Text Recognition Research , 1994, IEEE Trans. Pattern Anal. Mach. Intell..

[50]  S. S. Ravi,et al.  Efficient incremental constrained clustering , 2007, KDD '07.

[51]  Pablo Tamayo,et al.  Metagenes and molecular pattern discovery using matrix factorization , 2004, Proceedings of the National Academy of Sciences of the United States of America.