A Refined Approximation for Euclidean k-Means

In the Euclidean k-Means problem we are given a collection D of n points in a Euclidean space and a positive integer k. Our goal is to identify a collection of k points in the same space (centers) so as to minimize the sum of the squared Euclidean distances between each point in D and its closest center. This problem is known to be APX-hard, and the current best approximation ratio is achieved by a primal-dual 6.357-approximation based on a standard LP for the problem [Ahmadian et al. FOCS’17, SICOMP’20]. In this note we show how a minor modification of Ahmadian et al.’s analysis leads to a slightly improved 6.12903-approximation. As a related result, we also show that the mentioned LP has integrality gap at least 16+ √
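To make the objective concrete, the following minimal Python sketch evaluates the k-Means cost of a candidate set of centers: the sum, over all input points, of the squared Euclidean distance to the closest center. The names `squared_distance` and `kmeans_cost` and the toy data are illustrative assumptions and do not come from the paper. The sketch only evaluates the objective; it is not the primal-dual algorithm discussed above.

```python
from typing import Sequence

Point = Sequence[float]


def squared_distance(p: Point, c: Point) -> float:
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, c))


def kmeans_cost(points: Sequence[Point], centers: Sequence[Point]) -> float:
    """Sum over all points of the squared distance to the closest center."""
    return sum(min(squared_distance(p, c) for c in centers) for p in points)


# Hypothetical toy instance: four points forming two well-separated pairs.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]

# With k = 2, placing one center at the midpoint of each pair costs 1 per point.
two_centers = [(0.0, 1.0), (10.0, 1.0)]
cost_two = kmeans_cost(points, two_centers)

# With k = 1, a single center at the overall centroid is much more expensive.
one_center = [(5.0, 1.0)]
cost_one = kmeans_cost(points, one_center)
```

In this language, an α-approximation algorithm returns k centers whose cost is at most α times the minimum cost achievable by any k centers in the space.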
