Maximum Volume Clustering

The large volume principle proposed by Vladimir Vapnik, which advocates that hypotheses lying in an equivalence class with a larger volume are preferable, is a useful alternative to the large margin principle. In this paper, we introduce a clustering model based on the large volume principle, called maximum volume clustering (MVC), and propose two algorithms to solve it approximately: a soft-label MVC algorithm based on sequential quadratic programming and a hard-label MVC algorithm based on semi-definite programming. Our MVC model includes spectral clustering and maximum margin clustering as special cases, and is substantially more general. We also establish finite sample stability and an error bound for the soft-label MVC method. Experiments show that the proposed MVC approach compares favorably with state-of-the-art clustering algorithms.
