Semi-Supervised Maximum Margin Clustering with Pairwise Constraints

Pairwise constraints, which specify whether a pair of samples should be grouped together or not, have been successfully incorporated into conventional clustering methods such as k-means and spectral clustering to improve performance. Nevertheless, pairwise constraints have not been well studied in the recently proposed maximum margin clustering (MMC), which extends the maximum margin framework of supervised learning to clustering and often shows promising performance. This paper therefore proposes a pairwise constrained MMC algorithm. Based on the maximum margin idea in MMC, we propose a set of effective loss functions that discourage violation of the given pairwise constraints. We show that the resulting nonconvex optimization problem can be decomposed into a sequence of convex quadratic programs via the constrained concave-convex procedure (CCCP). We then present an efficient subgradient projection method to solve each convex problem in the CCCP sequence. Experiments on a number of real-world data sets show that the proposed constrained MMC algorithm is scalable and outperforms the existing constrained MMC approach as well as typical semi-supervised clustering counterparts.
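
To make the CCCP step concrete, the following is a minimal sketch on a toy one-dimensional problem, not the paper's constrained MMC objective. It assumes the objective f(x) = x^4 - 2x^2, split into a convex part x^4 and a concave part -2x^2; the function name `cccp_minimize` and its parameters are hypothetical. Each iteration replaces the concave part with its tangent at the current iterate and solves the resulting convex subproblem, here in closed form rather than by subgradient projection.

```python
import math


def cccp_minimize(x0, tol=1e-10, max_iter=100):
    """Toy CCCP for f(x) = x**4 - 2*x**2 (illustrative assumption only).

    Convex part:  u(x) = x**4
    Concave part: v(x) = -2*x**2
    """
    x = x0
    for _ in range(max_iter):
        # Linearize the concave part at the current iterate: v'(x) = -4x.
        slope = -4.0 * x
        # Convex subproblem: minimize z**4 + slope * z over z.
        # Setting the derivative 4*z**3 + slope to zero gives z = cbrt(-slope / 4).
        c = -slope / 4.0
        x_new = math.copysign(abs(c) ** (1.0 / 3.0), c)
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    return x


if __name__ == "__main__":
    # Different starting points reach different local minimizers (+1 or -1).
    print(cccp_minimize(0.5))
    print(cccp_minimize(-2.0))
```

In this toy case the iterates move monotonically toward one of the two local minimizers at x = 1 and x = -1, depending on the sign of the starting point. This reflects the general behavior of CCCP: it converges to a local solution of the nonconvex problem, not necessarily a global one.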
