Screening Tests for Lasso Problems

This paper surveys dictionary screening for the lasso problem. The lasso problem seeks a sparse linear combination of the columns of a dictionary that best matches a given target vector. Such sparse representations have proven useful in a variety of subsequent processing and decision tasks. For a given target vector, dictionary screening quickly identifies a subset of dictionary columns that will receive zero weight in a solution of the corresponding lasso problem. These columns can be removed from the dictionary before solving the lasso problem without affecting the optimality of the solution obtained. Screening has two potential advantages: it reduces the size of the dictionary, so the lasso problem can be solved with fewer resources, and it may reduce the time needed to obtain a solution. Using a geometrically intuitive framework, we provide basic insights for understanding useful lasso screening tests and their limitations. We also provide illustrative numerical studies on several datasets.
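
To make the idea of screening concrete, the sketch below implements one simple, well-known test: the basic SAFE sphere test of El Ghaoui et al. It is an illustration, not necessarily one of the tests developed in this paper. It assumes the lasso is posed as minimizing (1/2)||y - Bw||_2^2 + lam*||w||_1 over w, with lam > 0. The names `safe_screen`, `B`, `y` and `lam` are chosen here for illustration only.

```python
import numpy as np


def safe_screen(B, y, lam):
    """Basic SAFE sphere test (illustrative sketch).

    Assumes the lasso objective 0.5*||y - B w||_2^2 + lam*||w||_1, lam > 0.
    Returns a boolean mask that is True for every column of B that is
    guaranteed to receive zero weight and can therefore be discarded.
    """
    corr = np.abs(B.T @ y)           # |b_j^T y| for each column b_j
    lam_max = corr.max()             # smallest lam for which w = 0 is optimal
    if lam >= lam_max:
        # The all-zero vector is a solution, so every column can be dropped.
        return np.ones(B.shape[1], dtype=bool)
    col_norms = np.linalg.norm(B, axis=0)
    # Radius of the sphere bounding the dual solution around y / lam,
    # scaled by lam so the test can be written in terms of corr directly.
    radius = np.linalg.norm(y) * (lam_max - lam) / lam_max
    return corr < lam - col_norms * radius


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    B = rng.standard_normal((100, 1000))
    B /= np.linalg.norm(B, axis=0)   # unit-norm columns (an assumption)
    y = rng.standard_normal(100)
    lam = 0.9 * np.abs(B.T @ y).max()
    discard = safe_screen(B, y, lam)
    B_reduced = B[:, ~discard]       # smaller dictionary passed to any solver
    print(f"discarded {discard.sum()} of {B.shape[1]} columns")
```

The test costs one matrix-vector product, so it is cheap compared with solving the lasso problem itself. It is conservative: a column it keeps may still receive zero weight, and it rejects fewer columns as lam decreases relative to its largest useful value.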
