The Complexity of the k-means Method

The k-means method is a widely used technique for clustering points in Euclidean space. While it is extremely fast in practice, its worst-case running time is exponential in the number of data points. We prove that the k-means method can implicitly solve PSPACE-complete problems, providing a complexity-theoretic explanation for its worst-case running time. Our result parallels recent work on the complexity of the simplex method for linear programming.
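For orientation, the k-means method discussed here is the standard alternating heuristic (often called Lloyd's algorithm): assign each point to its nearest center, move each center to the mean of its assigned points, and repeat until the assignment stops changing. The sketch below is a minimal illustration under stated assumptions, not the construction used in the paper. The function names (`lloyd_kmeans`, `assign`, `recenter`), the tie-breaking rule (lowest center index wins), and the handling of empty clusters (the center stays where it is) are all choices made for this example. The returned iteration count is the quantity whose worst case is exponential in the number of data points.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def squared_distance(p: Sequence[float], q: Sequence[float]) -> float:
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def assign(points: Sequence[Point], centers: Sequence[Point]) -> List[int]:
    """Label each point with the index of its nearest center.

    Ties go to the lowest index (an assumption of this sketch).
    """
    return [
        min(range(len(centers)), key=lambda j: squared_distance(p, centers[j]))
        for p in points
    ]


def recenter(
    points: Sequence[Point], labels: Sequence[int], centers: Sequence[Point]
) -> List[Point]:
    """Move each center to the mean of the points currently assigned to it."""
    new_centers: List[Point] = []
    for j, old in enumerate(centers):
        members = [p for p, lab in zip(points, labels) if lab == j]
        if not members:
            # Assumption: an empty cluster keeps its previous center.
            new_centers.append(tuple(old))
            continue
        dim = len(old)
        new_centers.append(
            tuple(sum(p[d] for p in members) / len(members) for d in range(dim))
        )
    return new_centers


def lloyd_kmeans(
    points: Sequence[Point], centers: Sequence[Point], max_iter: int = 10_000
) -> Tuple[List[Point], List[int], int]:
    """Run the k-means method from the given initial centers.

    Returns (final centers, final labels, number of iterations performed).
    """
    centers = [tuple(c) for c in centers]
    labels = assign(points, centers)
    for iteration in range(1, max_iter + 1):
        centers = recenter(points, labels, centers)
        new_labels = assign(points, centers)
        if new_labels == labels:
            # The clustering is stable, so further steps change nothing.
            return centers, labels, iteration
        labels = new_labels
    return centers, labels, max_iter


if __name__ == "__main__":
    pts = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
    final_centers, final_labels, steps = lloyd_kmeans(pts, [(0.0, 0.0), (1.0, 0.0)])
    print(final_centers, final_labels, steps)
```

On typical inputs the loop stops after a handful of iterations. The worst-case results concern carefully constructed point sets and initial centers on which the number of iterations grows exponentially.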
