A Convex Approach to K-Means Clustering and Image Segmentation

A new convex formulation of data clustering and image segmentation is proposed, with a fixed number K of regions and optional penalization of the region perimeters. The problem is thus a spatially regularized version of the K-means problem, also known as the piecewise constant Mumford–Shah problem. The proposed approach relies on a discretization of the search space: a finite set of candidates must be specified, from which the K centroids are selected. After reformulation as an assignment problem, a convex relaxation is proposed, which involves a kind of \(l_{1,\infty }\) norm ball. A splitting of this set is proposed, so as to avoid the costly projection onto it. Some examples illustrate the efficiency of the approach.
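To make the discretized problem concrete, the following minimal sketch illustrates it on a toy data set, without the spatial regularization. It is not the method of the paper, which solves a convex relaxation; here the K centroids are chosen from the candidate list by exhaustive search, which is only feasible for very small instances. All function names and the toy data are hypothetical. The sketch also builds the binary assignment matrix (one row per candidate, one column per data point) and evaluates a mixed norm on it, assumed here to be the sum over candidates of the maximum over points, which counts the candidates actually used and is therefore at most K.

```python
from itertools import combinations


def sqdist(p, c):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((pi - ci) ** 2 for pi, ci in zip(p, c))


def assignment_cost(points, candidates, chosen):
    """Sum of squared distances from each point to its nearest chosen candidate."""
    return sum(min(sqdist(p, candidates[m]) for m in chosen) for p in points)


def best_k_candidates(points, candidates, k):
    """Exhaustive search over all size-k subsets of candidate indices (illustration only)."""
    best = min(
        combinations(range(len(candidates)), k),
        key=lambda subset: assignment_cost(points, candidates, subset),
    )
    return best, assignment_cost(points, candidates, best)


def binary_assignment(points, candidates, chosen):
    """Matrix z with z[m][n] = 1 if point n is assigned to candidate m, else 0."""
    z = [[0] * len(points) for _ in candidates]
    for n, p in enumerate(points):
        m = min(chosen, key=lambda idx: sqdist(p, candidates[idx]))
        z[m][n] = 1
    return z


def l1_inf_norm(z):
    """Sum over candidates (rows) of the maximum over points (columns)."""
    return sum(max(row) for row in z)


# Toy 1-D data: two groups of points and four candidate centroids (assumed values).
points = [(0.0,), (0.2,), (1.0,), (1.2,)]
candidates = [(0.0,), (0.1,), (0.5,), (1.1,)]

chosen, cost = best_k_candidates(points, candidates, k=2)
z = binary_assignment(points, candidates, chosen)
print(chosen, cost, l1_inf_norm(z))
```

In a convex relaxation of this kind, the entries of the assignment matrix would be allowed to take values in [0, 1] instead of {0, 1}, and the constraint that the mixed norm does not exceed K would replace the combinatorial choice of a subset.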
