Unmixing Incoherent Structures of Big Data by Randomized or Greedy Decomposition

Learning from big data by matrix decomposition typically suffers from expensive computation and from the mixing of complicated structures and noise. In this paper, we study more adaptive models and efficient algorithms that decompose a data matrix as the sum of semantic components with incoherent structures. We first introduce "GO decomposition (GoDec)", an alternating projection method that estimates the low-rank part $L$ and the sparse part $S$ from a data matrix $X=L+S+G$ corrupted by noise $G$. Two acceleration strategies are proposed to obtain scalable unmixing algorithms for big data: 1) bilateral random projection (BRP) speeds up the update of $L$ in GoDec by a closed form built from left and right random projections of $X-S$ in lower dimensions; 2) the greedy bilateral (GreB) paradigm updates the left and right factors of $L$ in a mutually adaptive and greedy incremental manner, and achieves significant improvement in both time and sample complexities. We then propose three nontrivial variants of GoDec that generalize GoDec to more general data types and whose fast algorithms can be derived from the two strategies......
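
For orientation, the GoDec model and the BRP shortcut summarized above can be written out explicitly. This is a sketch of the standard formulation rather than a statement taken from this abstract: the rank bound $r$, the cardinality bound $k$, the iteration index $t$, the matrix sizes $m \times n$ and the random matrices $A_1$, $A_2$ are symbols assumed here for illustration.

$$\min_{L,S}\ \|X - L - S\|_F^2 \quad \text{s.t.} \quad \operatorname{rank}(L) \le r,\ \ \operatorname{card}(S) \le k$$

$$L_t = \arg\min_{\operatorname{rank}(L) \le r} \|X - L - S_{t-1}\|_F^2, \qquad S_t = \arg\min_{\operatorname{card}(S) \le k} \|X - L_t - S\|_F^2$$

$$Y_1 = (X - S)A_1, \quad Y_2 = (X - S)^{T}A_2, \quad L \approx Y_1\,(A_2^{T}Y_1)^{-1}\,Y_2^{T}, \qquad A_1 \in \mathbb{R}^{n \times r},\ A_2 \in \mathbb{R}^{m \times r}$$

Under these assumptions, the exact $L$ update is the rank-$r$ truncated SVD of $X - S_{t-1}$, and the $S$ update keeps the $k$ largest-magnitude entries of $X - L_t$. The third line replaces the SVD with an $r \times r$ inverse, which is where the speed-up comes from; it reproduces $X - S$ exactly when $X - S$ has rank $r$ and $A_2^{T}Y_1$ is invertible.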
