The Lasso under Heteroscedasticity

The performance of the Lasso is well understood under the assumptions of the standard linear model with homoscedastic noise. However, in several applications, the standard model does not describe the important features of the data. This paper examines how the Lasso performs on a non-standard model that is motivated by medical imaging applications. In these applications, the variance of the noise scales linearly with the expectation of the observation. As in all heteroscedastic models, the noise terms in this Poisson-like model are not independent of the design matrix. More specifically, this paper studies the sign consistency of the Lasso under a sparse Poisson-like model. In addition to studying sufficient conditions for the sign consistency of the Lasso estimate, this paper also gives necessary conditions for sign consistency. Both sets of conditions are comparable to results for the homoscedastic model, showing that when a measure of the signal-to-noise ratio is large, the Lasso performs well on both Poisson-like data and homoscedastic data. Simulations reveal that the Lasso performs equally well in terms of model selection on both Poisson-like data and homoscedastic data (with properly scaled noise variance) across a range of parameterizations. Taken as a whole, these results suggest that the Lasso is robust to Poisson-like heteroscedastic noise.
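To make the setup concrete, the sketch below (not the paper's code) simulates the Poisson-like model described above: a sparse linear model y = X beta* + eps in which Var(eps_i) scales with the mean of the i-th observation, compared against a homoscedastic benchmark whose noise variance is matched on average. The dimensions, signal strength, and regularization level are illustrative assumptions, not values taken from the paper.

```python
# Minimal simulation sketch of the Poisson-like heteroscedastic model.
# Assumptions (not from the paper): a nonnegative design so the mean
# response is positive, n=200, p=50, k=5, and alpha=0.1 for the Lasso.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n, p, k = 200, 50, 5                      # samples, predictors, nonzeros

X = rng.uniform(0.0, 1.0, size=(n, p))    # nonnegative design -> positive means
beta = np.zeros(p)
beta[:k] = 2.0                            # sparse positive signal
mu = X @ beta                             # mean response, mu_i > 0

# Poisson-like noise: Var(eps_i) proportional to the mean mu_i.
sigma2 = 1.0
eps_het = rng.normal(scale=np.sqrt(sigma2 * mu))
# Homoscedastic benchmark with the same average noise variance.
eps_hom = rng.normal(scale=np.sqrt(sigma2 * mu.mean()), size=n)

for name, eps in [("Poisson-like", eps_het), ("homoscedastic", eps_hom)]:
    y = mu + eps
    fit = Lasso(alpha=0.1, fit_intercept=False).fit(X, y)
    recovered = np.array_equal(np.sign(fit.coef_), np.sign(beta))
    print(f"{name:>13s}: exact sign recovery = {recovered}")
```

Under the paper's conclusion, the two noise regimes should succeed or fail together once their average variances are matched; sweeping alpha, n, p, or k turns this single run into the kind of model-selection comparison the simulations report.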
