Semi-Supervised Clustering Based on Exemplars Constraints

SUMMARY In general, semi-supervised clustering can outperform unsupervised clustering. Since 2001, pairwise constraints have been an important paradigm for semi-supervised clustering. In this paper, we show that pairwise constraints can adversely affect clustering performance in certain situations, and we analyze the reasons for this in detail. To overcome these disadvantages, we first outline some exemplars constraints (ECs). Based on these constraints, we then describe a semi-supervised clustering framework and design an exemplars constraints expectation–maximization algorithm. Finally, experiments on standard datasets show that the algorithm based on exemplars constraints outperforms both the corresponding unsupervised clustering algorithm and semi-supervised algorithms based on pairwise constraints.
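
The summary does not define the exemplars constraints or the resulting algorithm, so the following is only a rough sketch of how a constraint of this kind might enter an expectation–maximization loop. It assumes that an exemplars constraint marks one data point as a known representative of a cluster, and it clamps that point's responsibility in the E-step. The function name `exemplar_constrained_em`, the `exemplars` mapping, and the spherical Gaussian mixture model are illustrative assumptions, not the paper's method.

```python
import numpy as np

def exemplar_constrained_em(X, exemplars, n_iter=50):
    """EM for a spherical Gaussian mixture with clamped exemplar points.

    X         : (n, d) array of data points.
    exemplars : dict {cluster index j: row index in X} (hypothetical format);
                keys must be 0..k-1, one assumed exemplar per cluster.
    Returns (labels, means).
    """
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    k = len(exemplars)

    # Assumption: initialise each cluster mean at its exemplar.
    means = np.array([X[exemplars[j]] for j in range(k)])
    variances = np.full(k, X.var() + 1e-6)
    weights = np.full(k, 1.0 / k)

    for _ in range(n_iter):
        # E-step: responsibilities from log densities (stabilised by max-shift).
        sq = ((X[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)  # (n, k)
        log_p = (np.log(weights)
                 - 0.5 * d * np.log(2.0 * np.pi * variances)
                 - 0.5 * sq / variances)
        log_p -= log_p.max(axis=1, keepdims=True)
        resp = np.exp(log_p)
        resp /= resp.sum(axis=1, keepdims=True)

        # Constraint step: each exemplar belongs to its own cluster with certainty.
        for j, idx in exemplars.items():
            resp[idx] = 0.0
            resp[idx, j] = 1.0

        # M-step: standard weighted updates (nk >= 1 because of the clamping).
        nk = resp.sum(axis=0)
        weights = nk / n
        means = (resp.T @ X) / nk[:, None]
        sq = ((X[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
        variances = (resp * sq).sum(axis=0) / (d * nk) + 1e-6

    return resp.argmax(axis=1), means
```

Calling `exemplar_constrained_em(X, {0: i0, 1: i1})` treats rows `i0` and `i1` as the assumed exemplars of clusters 0 and 1. Unlike a must-link or cannot-link pair, a constraint of this form ties a point directly to a cluster, which is why it can be applied as a simple clamp on the responsibilities.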
