Compressive PCA on Graphs

Randomized algorithms reduce the complexity of low-rank recovery methods only with respect to the dimension p of a big dataset Y ∈ ℝ^{p×n}. However, the case of large n is cumbersome to tackle without sacrificing the recovery quality. The recently introduced Fast Robust PCA on Graphs (FRPCAG) approximates a recovery method for matrices that are low-rank on graphs constructed between their rows and columns. In this paper we provide a novel framework, Compressive PCA on Graphs (CPCA), for the approximate recovery of such data matrices from sampled measurements. We introduce a restricted isometry property (RIP) condition for low-rank matrices on graphs that enables efficient sampling of the rows and columns, so that FRPCAG can be performed on the sampled matrix. Several efficient, parallel and parameter-free decoders are presented, along with their theoretical analysis, for the low-rank recovery and clustering applications of PCA. On a single-core machine, CPCA gains a speed-up of p/k over FRPCAG, where k ≪ p is the subspace dimension. Numerically, CPCA can cluster 70,000 MNIST digits in less than a minute and recover a low-rank matrix of size 10304×1000 in 15 seconds, which is 6 and 100 times faster than FRPCAG and exact recovery, respectively.
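
To make the pipeline concrete, the sketch below runs the compressive steps on toy data: a subset of rows and columns is sampled, graphs are built between the rows and between the columns of the small sampled matrix, and a graph-regularized recovery is performed on that core only. This is a minimal sketch, not the paper's exact algorithm: the robust ℓ1 data term of FRPCAG is replaced by a smooth Frobenius one, the sampling is plain uniform selection, the final decoding back to the full p×n matrix is omitted, and the helper names (knn_laplacian, graph_regularized_recovery) are illustrative assumptions.

```python
# Hypothetical sketch of the CPCA pipeline (sampling rate, graph construction
# and regularization weights are illustrative assumptions).
import numpy as np
from scipy.sparse.csgraph import laplacian
from sklearn.neighbors import kneighbors_graph

def knn_laplacian(Z, k=10):
    """Combinatorial Laplacian of a symmetrized k-NN graph on the rows of Z."""
    W = kneighbors_graph(Z, n_neighbors=k, mode='connectivity')
    W = 0.5 * (W + W.T)                          # symmetrize the adjacency
    return laplacian(W).toarray()

def graph_regularized_recovery(Ys, Lr, Lc, gr=1.0, gc=1.0, n_iter=300):
    """Minimize ||X - Ys||_F^2 + gr*tr(X^T Lr X) + gc*tr(X Lc X^T) by gradient descent."""
    lip = 2.0 * (1.0 + gr * np.linalg.norm(Lr, 2) + gc * np.linalg.norm(Lc, 2))
    X = Ys.copy()
    for _ in range(n_iter):
        grad = 2.0 * (X - Ys) + 2.0 * gr * (Lr @ X) + 2.0 * gc * (X @ Lc)
        X -= grad / lip                          # step size = 1 / Lipschitz constant
    return X

# Toy data: p features x n samples, approximately low-rank plus noise.
rng = np.random.default_rng(0)
p, n, rank = 200, 400, 5
Y = rng.standard_normal((p, rank)) @ rng.standard_normal((rank, n))
Y += 0.1 * rng.standard_normal((p, n))

# 1. Sample a subset of rows and columns (the compression step).
rows = rng.choice(p, size=p // 4, replace=False)
cols = rng.choice(n, size=n // 4, replace=False)
Ys = Y[np.ix_(rows, cols)]

# 2. Graphs between the rows and between the columns of the sampled matrix.
Lr = knn_laplacian(Ys)        # (p/4) x (p/4) row-graph Laplacian
Lc = knn_laplacian(Ys.T)      # (n/4) x (n/4) column-graph Laplacian

# 3. Graph-regularized (FRPCAG-style) recovery on the small core only;
#    a CPCA decoder would then lift this back to the full p x n matrix.
Xs = graph_regularized_recovery(Ys, Lr, Lc)
print(Xs.shape)               # (50, 100)
```

The speed-up claimed in the abstract comes from step 3 operating on the sampled core rather than on the full matrix; the decoding step (not shown) uses the graph structure to extend the small solution to the remaining rows and columns.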
