Oracle Based Active Set Algorithm for Scalable Elastic Net Subspace Clustering

State-of-the-art subspace clustering methods are based on expressing each data point as a linear combination of other data points while regularizing the matrix of coefficients with ℓ<sub>1</sub>, ℓ<sub>2</sub> or nuclear norms. ℓ<sub>1</sub> regularization is guaranteed to give a subspace-preserving affinity (i.e., there are no connections between points from different subspaces) under broad theoretical conditions, but the clusters may not be connected. ℓ<sub>2</sub> and nuclear norm regularization often improve connectivity, but give a subspace-preserving affinity only for independent subspaces. Mixed ℓ<sub>1</sub>, ℓ<sub>2</sub> and nuclear norm regularizations offer a balance between the subspace-preserving and connectedness properties, but this comes at the cost of increased computational complexity. This paper studies the geometry of the elastic net regularizer (a mixture of the ℓ<sub>1</sub> and ℓ<sub>2</sub> norms) and uses it to derive a provably correct and scalable active set method for finding the optimal coefficients. Our geometric analysis also provides a theoretical justification and a geometric interpretation for the balance between the connectedness (due to ℓ<sub>2</sub> regularization) and subspace-preserving (due to ℓ<sub>1</sub> regularization) properties for elastic net subspace clustering. Our experiments show that the proposed active set method not only achieves state-of-the-art clustering performance, but also efficiently handles large-scale datasets.
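To make the self-expressive formulation concrete, the following is a minimal sketch of elastic net self-expression for a single data point. It is not the active set method proposed in the paper: it uses plain proximal gradient descent (ISTA) as a stand-in solver. The objective λ‖c‖<sub>1</sub> + (1−λ)/2 ‖c‖<sub>2</sub><sup>2</sup> + γ/2 ‖x<sub>j</sub> − Xc‖<sub>2</sub><sup>2</sup> with c<sub>j</sub> = 0, the parameter names `lam` and `gamma`, their default values, and the function name are all assumptions made for illustration.

```python
import numpy as np


def soft_threshold(v, t):
    # Proximal operator of t * ||.||_1 (elementwise shrinkage toward zero).
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def elastic_net_self_expression(X, j, lam=0.9, gamma=50.0, n_iter=500):
    """Express column j of X as a combination of the other columns.

    Assumed objective (illustrative, not taken from the text above):
        lam * ||c||_1 + (1 - lam) / 2 * ||c||_2^2
        + gamma / 2 * ||x_j - X c||_2^2,   subject to c[j] = 0.
    Solved with proximal gradient descent, NOT the paper's active set method.
    """
    _, n_points = X.shape
    x = X[:, j]
    c = np.zeros(n_points)
    # Step size = 1 / Lipschitz constant of the smooth (quadratic) part.
    step = 1.0 / (gamma * np.linalg.norm(X, 2) ** 2 + (1.0 - lam))
    for _ in range(n_iter):
        grad = gamma * (X.T @ (X @ c - x)) + (1.0 - lam) * c
        c = soft_threshold(c - step * grad, step * lam)
        c[j] = 0.0  # a point may not be used to represent itself
    return c


# Toy data: four unit vectors in span(e1, e2) and four in span(e3, e4).
angles = np.array([0.0, 0.3, 0.6, 0.9])
zeros = np.zeros_like(angles)
S1 = np.stack([np.cos(angles), np.sin(angles), zeros, zeros])
S2 = np.stack([zeros, zeros, np.cos(angles), np.sin(angles)])
X = np.hstack([S1, S2])  # shape (4, 8); columns are data points

c = elastic_net_self_expression(X, j=0)
same_subspace = c[1:4]   # coefficients on points from the same subspace
other_subspace = c[4:]   # coefficients on points from the other subspace
```

In this toy case the two subspaces are orthogonal, so the coefficients on points from the other subspace stay at zero (a subspace-preserving representation), while the ℓ<sub>2</sub> term tends to spread weight across several points of the same subspace. Moving `lam` toward 1 makes the solution sparser; moving it toward 0 makes it denser.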
