Identification of overcomplete dictionaries and their application in distributed classification problems

Abstract of the thesis by Zahra Shakeri. Thesis Director: Prof. Waheed U. Bajwa

The work presented in this thesis studies the conditions required for reliable dictionary recovery based on the maximal response criterion and explores the application of dictionary learning to the classification of distributed data.

The first part of this thesis revisits the problem of recovering an overcomplete dictionary in a local neighborhood from training samples using the so-called maximal response criterion. While it is known in the literature that the maximal response criterion can be used for asymptotic exact recovery of a dictionary in a local neighborhood, those results do not allow sparsity levels in signal representations to scale linearly in the ambient dimension. The first contribution of this work is to introduce a new condition on the sparse representation of signals and to leverage a new proof technique to establish that the maximal response criterion can in fact handle linear sparsity (modulo a logarithmic factor) of signal representations. While the focus of this work is on asymptotic exact recovery, the same ideas can be used in a straightforward manner to strengthen the original maximal response criterion-based results involving noisy observations and a finite number of training samples.

The second part of this thesis addresses the problem of collaborative training of nonlinear classifiers using big, distributed training data. The proposed supervised learning
