Paper Information - Non-parametric Mixture Models for Clustering

Non-parametric Mixture Models for Clustering

Mixture models have been widely used for data clustering. However, commonly used mixture models are generally of a parametric form (e.g., mixture of Gaussian distributions or GMM), which significantly limits their capacity in fitting diverse multidimensional data distributions encountered in practice. We propose a non-parametric mixture model (NMM) for data clustering in order to detect clusters generated from arbitrary unknown distributions, using non-parametric kernel density estimates. The proposed model is non-parametric since the generative distribution of each data point depends only on the rest of the data points and the chosen kernel. A leave-one-out likelihood maximization is performed to estimate the parameters of the model. The NMM approach, when applied to cluster high-dimensional text datasets, significantly outperforms the state-of-the-art and classical approaches such as K-means, Gaussian Mixture Models, spectral clustering and linkage methods.
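The leave-one-out idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the NMM algorithm from the paper: it only evaluates the leave-one-out log-likelihood of a plain kernel density estimate, where each point's density is computed from the remaining points and the chosen kernel. The function name `loo_kde_log_likelihood`, the isotropic Gaussian kernel, and the single `bandwidth` parameter are assumptions made for illustration.

```python
import numpy as np


def loo_kde_log_likelihood(X, bandwidth):
    """Leave-one-out log-likelihood of a Gaussian kernel density estimate.

    Illustrative sketch only (hypothetical helper, not the paper's model):
    the density at each point is estimated from all *other* points.
    """
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    # Pairwise squared Euclidean distances, shape (n, n).
    diff = X[:, None, :] - X[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)
    # Isotropic Gaussian kernel with its normalizing constant.
    norm = (2.0 * np.pi * bandwidth ** 2) ** (-d / 2.0)
    K = norm * np.exp(-sq_dist / (2.0 * bandwidth ** 2))
    # Zero the diagonal so point i does not contribute to its own estimate.
    np.fill_diagonal(K, 0.0)
    density = K.sum(axis=1) / (n - 1)
    # Note: a very small bandwidth can underflow to zero density (log -> -inf).
    return float(np.sum(np.log(density)))


# Tiny usage example: two points on the real line.
X = np.array([[0.0], [1.0]])
print(loo_kde_log_likelihood(X, bandwidth=1.0))
```

Leaving out the self-contribution matters because otherwise the likelihood could be made arbitrarily large by shrinking the bandwidth around each point. In the sketch above, comparing the score across several candidate bandwidths is one simple way to use it.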

Rong Jin | Anil K. Jain | Pavan Kumar Mallapragada
