Faster coreset construction for subspace and projective clustering

We present randomized coreset constructions for subspace and projective clustering. If $A$ is the input matrix, our construction relies on projecting $A$ onto an approximation of its first few right singular vectors. This yields a faster algorithm than the corresponding deterministic algorithm of Feldman et al. [9] for the same problem, while maintaining a desired accuracy. We complement our theoretical results with supporting experiments.
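The projection step described above can be illustrated with a minimal sketch. It assumes a standard randomized range finder (Gaussian sketch plus a few power iterations) as the way to approximate the top right singular vectors; the abstract does not state which approximation method is used, nor how the coreset is formed from the projected matrix, so this is not the authors' construction. The function names and the parameters `oversample`, `n_iter`, and `seed` are hypothetical.

```python
import numpy as np


def approx_right_singular_vectors(A, k, oversample=10, n_iter=2, seed=0):
    """Approximate the top-k right singular vectors of A (n x d).

    Assumption: a generic randomized range finder, not necessarily
    the method used in the paper.
    """
    rng = np.random.default_rng(seed)
    n, d = A.shape
    ell = min(k + oversample, n, d)
    # Sketch the row space of A with a Gaussian test matrix.
    Q, _ = np.linalg.qr(A.T @ rng.standard_normal((n, ell)))
    # Power iterations sharpen the basis; re-orthonormalize each round.
    for _ in range(n_iter):
        Q, _ = np.linalg.qr(A.T @ (A @ Q))
    # Exact SVD of the small n x ell matrix A Q.
    _, _, Wt = np.linalg.svd(A @ Q, full_matrices=False)
    # Map the top-k right singular vectors back to d dimensions.
    return Q @ Wt[:k].T


def project_onto_subspace(A, V):
    """Project the rows of A onto the span of the orthonormal columns of V."""
    return (A @ V) @ V.T
```

Calling `project_onto_subspace(A, approx_right_singular_vectors(A, k))` returns an $n \times d$ matrix of rank at most $k$. When $A$ has rank exactly $k$, the projection reproduces $A$ up to rounding error.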

[1]  Sariel Har-Peled,et al.  No, Coreset, No Cry , 2004, FSTTCS.

[2]  David P. Woodruff,et al.  Coresets and sketches for high dimensional subspace approximation problems , 2010, SODA '10.

[3]  Santosh S. Vempala,et al.  Adaptive Sampling and Fast Low-Rank Matrix Approximation , 2006, APPROX-RANDOM.

[4]  Michael W. Mahoney.  Randomized Algorithms for Matrices and Data , 2011, Found. Trends Mach. Learn..

[5]  Michael B. Cohen,et al.  Dimensionality Reduction for k-Means Clustering and Low Rank Approximation , 2014, STOC.

[6]  Michael W. Mahoney.  Randomized Algorithms for Matrices and Data , 2010 .

[7]  Kasturi R. Varadarajan,et al.  Geometric Approximation via Coresets , 2007 .

[8]  Xin Xiao,et al.  A near-linear algorithm for projective clustering integer points , 2012, SODA.

[9]  Sariel Har-Peled,et al.  On coresets for k-means and k-median clustering , 2004, STOC '04.

[10]  Pankaj K. Agarwal,et al.  Approximating extent measures of points , 2004, JACM.

[11]  Dan Feldman,et al.  Turning big data into tiny data: Constant-size coresets for k-means, PCA and projective clustering , 2013, SODA.

[12]  Michael Langberg,et al.  A unified framework for approximating and clustering data , 2011, STOC.

[13]  Santosh S. Vempala,et al.  Matrix approximation and projective clustering via volume sampling , 2006, SODA '06.

[14]  Amos Fiat,et al.  Coresets for Weighted Facilities and Their Applications , 2006, 2006 47th Annual IEEE Symposium on Foundations of Computer Science (FOCS'06).

[15]  Tamás Sarlós,et al.  Improved Approximation Algorithms for Large Matrices via Random Projections , 2006, 2006 47th Annual IEEE Symposium on Foundations of Computer Science (FOCS'06).