DIFFRAC: a discriminative and flexible framework for clustering

We present a novel linear clustering framework (DIFFRAC) which relies on a linear discriminative cost function and a convex relaxation of a combinatorial optimization problem. The large convex optimization problem is solved through a sequence of lower-dimensional singular value decompositions. This framework has several attractive properties: (1) although apparently similar to K-means, it exhibits superior clustering performance to K-means, in particular in terms of robustness to noise. (2) It can be readily extended to nonlinear clustering if the discriminative cost function is based on positive definite kernels, and can then be seen as an alternative to spectral clustering. (3) Prior information on the partition is easily incorporated, leading to state-of-the-art performance for semi-supervised learning, in both clustering and classification settings. We present empirical evaluations of our algorithms on synthetic and real medium-scale datasets.
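
To make the discriminative cost concrete, here is a sketch (a reconstruction consistent with the abstract, not a verbatim statement of the paper's equations; the data matrix X, the 0/1 cluster-indicator matrix Y, and the ridge parameter lambda are notation introduced here). A candidate partition, encoded by Y, is scored by how well it can be predicted from the data through regularized linear regression:

    J(Y) = \min_{W,\, b} \; \frac{1}{n} \left\| Y - X W - \mathbf{1}_n b^\top \right\|_F^2 + \lambda \, \mathrm{tr}\!\left( W^\top W \right).

Minimizing over (W, b) in closed form leaves a cost of the form tr(Y Y^T A), where the matrix A depends only on the centered data and on lambda. The combinatorial search over indicator matrices Y is then relaxed into a convex program over the equivalence matrices M = Y (Y^T Y)^{-1} Y^T, which is what the sequence of singular value decompositions solves; replacing inner products by a positive definite kernel yields the nonlinear variant mentioned in point (2).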
