Two-Step Greedy Subspace Clustering

Greedy subspace clustering methods provide an efficient way to cluster large-scale multimedia datasets. However, these methods do not guarantee a global optimum and their clustering performance mainly depends on their initializations. To alleviate this initialization problem, this paper proposes a two-step greedy strategy by exploring proper neighbors that span an initial subspace. Firstly, for each data point, we seek a sparse representation with respect to its nearest neighbors. The data points corresponding to nonzero entries in the learning representation form an initial subspace, which potentially rejects bad or redundant data points. Secondly, the subspace is updated by adding an orthogonal basis involved with the newly added data points. Experimental results on real-world applications demonstrate that our method can significantly improve the clustering accuracy of greedy subspace clustering methods without scarifying much computational time.

[1]  René Vidal,et al.  A Benchmark for the Comparison of 3-D Motion Segmentation Algorithms , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[2]  Tieniu Tan,et al.  Robust Low-Rank Representation via Correntropy , 2013, 2013 2nd IAPR Asian Conference on Pattern Recognition.

[3]  René Vidal,et al.  Sparse Subspace Clustering: Algorithm, Theory, and Applications , 2012, IEEE transactions on pattern analysis and machine intelligence.

[4]  Allen Y. Yang,et al.  Robust Statistical Estimation and Segmentation of Multiple Subspaces , 2006, 2006 Conference on Computer Vision and Pattern Recognition Workshop (CVPRW'06).

[5]  Allen Y. Yang,et al.  Unsupervised segmentation of natural images via lossy data compression , 2008, Comput. Vis. Image Underst..

[6]  David J. Kriegman,et al.  Clustering appearances of objects under varying illumination conditions , 2003, 2003 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2003. Proceedings..

[7]  Helmut Bölcskei,et al.  Subspace clustering via thresholding and spectral clustering , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.

[8]  René Vidal,et al.  Subspace Clustering , 2011, IEEE Signal Processing Magazine.

[9]  S. Shankar Sastry,et al.  Generalized principal component analysis (GPCA) , 2005, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[10]  Takeo Kanade,et al.  A Multibody Factorization Method for Independently Moving Objects , 1998, International Journal of Computer Vision.

[11]  David J. Kriegman,et al.  Acquiring linear subspaces for face recognition under variable lighting , 2005, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[12]  Xuran Zhao,et al.  A subspace co-training framework for multi-view clustering , 2014, Pattern Recognit. Lett..

[13]  Gilad Lerman,et al.  Median K-Flats for hybrid linear modeling with many outliers , 2009, 2009 IEEE 12th International Conference on Computer Vision Workshops, ICCV Workshops.

[14]  Kenichi Kanatani,et al.  Motion segmentation by subspace separation and model selection , 2001, Proceedings Eighth IEEE International Conference on Computer Vision. ICCV 2001.

[15]  Aswin C. Sankaranarayanan,et al.  Greedy feature selection for subspace clustering , 2013, J. Mach. Learn. Res..

[16]  Tieniu Tan,et al.  Robust Subspace Clustering via Half-Quadratic Minimization , 2013, 2013 IEEE International Conference on Computer Vision.

[17]  Christopher M. Bishop,et al.  Mixtures of Probabilistic Principal Component Analyzers , 1999, Neural Computation.

[18]  Constantine Caramanis,et al.  Greedy Subspace Clustering , 2014, NIPS.

[19]  René Vidal,et al.  Sparse subspace clustering , 2009, CVPR.

[20]  D. Donoho For most large underdetermined systems of linear equations the minimal 𝓁1‐norm solution is also the sparsest solution , 2006 .

[21]  R. Tibshirani Regression Shrinkage and Selection via the Lasso , 1996 .

[22]  Sham M. Kakade,et al.  Multi-view clustering via canonical correlation analysis , 2009, ICML '09.

[23]  Kun Huang,et al.  Multiscale Hybrid Linear Models for Lossy Image Representation , 2006, IEEE Transactions on Image Processing.

[24]  Yong Yu,et al.  Robust Recovery of Subspace Structures by Low-Rank Representation , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[25]  René Vidal,et al.  Multiframe Motion Segmentation with Missing Data Using PowerFactorization and GPCA , 2004, Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2004. CVPR 2004..

[26]  Guangliang Chen,et al.  Spectral Curvature Clustering (SCC) , 2009, International Journal of Computer Vision.