Learning overcomplete representations from distributed data: a brief review

Most research on dictionary learning has focused on developing algorithms under the assumption that the data is available at a centralized location. Often, however, the data cannot be gathered at one location because of practical constraints such as data aggregation costs and privacy concerns, and centralized dictionary learning algorithms may not be the best choice in such settings. This motivates the design of dictionary learning algorithms that treat the distributed nature of the data as one of the problem variables. Just as in centralized settings, the distributed dictionary learning problem can be posed in more than one way depending on the problem setup. The most notable distinguishing features are the online versus batch nature of the data and the representative versus discriminative nature of the dictionaries. This paper reviews several distributed dictionary learning algorithms designed to tackle different problem setups. One of these algorithms is cloud K-SVD, which solves the dictionary learning problem for batch data in distributed settings. A distinguishing feature of cloud K-SVD is that it has been shown to converge to its centralized counterpart, namely, the K-SVD solution; no such guarantees are provided for other distributed dictionary learning algorithms. Convergence of cloud K-SVD to the centralized K-SVD solution means that problems solvable by K-SVD in centralized settings can now be solved in distributed settings with similar performance. Finally, cloud K-SVD is used as an example to show the advantages attainable by deploying distributed dictionary learning algorithms on real-world distributed datasets.
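
For concreteness, the batch problem that K-SVD-type methods target is commonly written as a sparsity-constrained matrix factorization. The sketch below states it for data split across sites. All notation is assumed for illustration and does not come from the text above: N sites, local data matrices Y_i, a shared dictionary D with more columns than rows (overcomplete), local coefficient matrices X_i, and a sparsity level T_0.

```latex
% Assumed notation (not from the source text):
%   Y_i \in \mathbb{R}^{n \times S_i}  local data held at site i, i = 1, ..., N
%   D   \in \mathbb{R}^{n \times K}    shared dictionary, K > n (overcomplete)
%   X_i \in \mathbb{R}^{K \times S_i}  sparse coefficients for the data at site i
%   T_0                                maximum number of nonzeros per coefficient vector
\min_{D,\,\{X_i\}_{i=1}^{N}} \;\; \sum_{i=1}^{N} \left\| Y_i - D X_i \right\|_F^2
\quad \text{subject to} \quad
\| x \|_0 \le T_0 \;\; \text{for every column } x \text{ of every } X_i .
```

Stacking the local matrices as Y = [Y_1, ..., Y_N] and X = [X_1, ..., X_N] gives the same objective, \| Y - D X \|_F^2, that a centralized method would minimize. Under this formulation the distributed setting does not change the objective. It changes only the constraint that no single site has access to all of Y.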
