Beyond L2-loss functions for learning sparse models

In sparse learning, the squared Euclidean distance is a popular choice for measuring approximation quality. However, other parametrized loss functions, including asymmetric losses, have also attracted research interest. In this paper, we perform sparse learning using a broad class of smooth piecewise linear-quadratic (PLQ) loss functions, including robust and asymmetric losses that can be adapted to many real-world scenarios. The proposed framework also supports heterogeneous data modeling by allowing different PLQ penalties for different blocks of the residual vector (split-PLQ). We demonstrate the impact of the proposed sparse learning approach in image recovery, and apply the split-PLQ loss to tag refinement for image annotation and retrieval.
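To make the idea of block-wise PLQ penalties concrete, the following is a minimal sketch and not the paper's implementation. It assumes two illustrative smooth PLQ losses: the standard Huber loss and a simple asymmetric variant that weights positive and negative residuals by `tau` and `1 - tau`. The names `huber`, `asymmetric_huber`, and `split_loss`, the threshold `kappa`, and the block layout are all hypothetical choices made for this example; the asymmetric variant is not necessarily the quantile Huber definition used by the authors.

```python
import numpy as np

def huber(r, kappa=1.0):
    """Huber loss: quadratic for |r| <= kappa, linear beyond (a smooth PLQ function)."""
    r = np.asarray(r, dtype=float)
    a = np.abs(r)
    return np.where(a <= kappa, a**2 / (2.0 * kappa), a - kappa / 2.0)

def asymmetric_huber(r, tau=0.5, kappa=1.0):
    """Illustrative asymmetric loss (an assumption, not the paper's definition):
    positive residuals are weighted by tau, negative ones by 1 - tau."""
    r = np.asarray(r, dtype=float)
    weight = np.where(r >= 0, tau, 1.0 - tau)
    return weight * huber(r, kappa)

def split_loss(residual, blocks):
    """Sum a different loss over each block of the residual vector.

    blocks is a list of (index, loss_fn) pairs, where index selects the
    entries of the residual that loss_fn should be applied to.
    """
    residual = np.asarray(residual, dtype=float)
    return sum(float(np.sum(loss_fn(residual[idx]))) for idx, loss_fn in blocks)

# Hypothetical residual with two blocks: the first uses a symmetric robust
# loss, the second penalizes positive errors more heavily (tau = 0.9).
residual = np.array([0.5, 3.0, -2.0, 2.0])
blocks = [
    (slice(0, 2), huber),
    (slice(2, 4), lambda r: asymmetric_huber(r, tau=0.9)),
]
total = split_loss(residual, blocks)
```

In this sketch the outlier residual 3.0 contributes linearly rather than quadratically, and in the second block a residual of +2.0 costs nine times as much as a residual of -2.0, which is the kind of per-block behavior a split penalty is meant to allow.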
