DASSO: connections between the Dantzig selector and lasso

Summary.  We propose a new algorithm, DASSO, for fitting the entire coefficient path of the Dantzig selector at a computational cost similar to that of the least angle regression (LARS) algorithm used to compute the lasso. DASSO efficiently constructs a piecewise linear path through a sequential simplex-like procedure that closely parallels LARS. Comparing the two algorithms sheds new light on how the lasso and the Dantzig selector are related. In addition, we provide theoretical conditions on the design matrix X under which the lasso and Dantzig selector coefficient estimates are identical for certain tuning parameters. As a consequence, in many instances the powerful non-asymptotic bounds that have been developed for the Dantzig selector extend to the lasso. Finally, through empirical studies of simulated and real-world data sets, we show that in practice, when the bounds hold for the Dantzig selector, they almost always also hold for the lasso.

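The connection described in the summary can be checked numerically. The sketch below is not the DASSO path algorithm and is not code from the paper: it fits the lasso with scikit-learn and solves the Dantzig selector as a linear program with SciPy, then compares the two coefficient vectors at a matched tuning parameter. The helper name `fit_dantzig`, the simulated design, and the choice of tuning parameter are illustrative assumptions.

```python
# Minimal sketch (assumptions noted above): lasso vs. Dantzig selector at a
# matched tuning parameter. The Dantzig selector
#   min ||b||_1   s.t.  ||X'(y - Xb)||_inf <= lam
# is written as a linear program in (u, v) with b = u - v, u, v >= 0.
import numpy as np
from scipy.optimize import linprog
from sklearn.linear_model import Lasso

def fit_dantzig(X, y, lam):
    """Solve the Dantzig selector as an LP (illustrative helper)."""
    n, p = X.shape
    G = X.T @ X
    c = np.ones(2 * p)                      # objective: sum(u) + sum(v) = ||b||_1
    # Constraints:  -lam <= X'y - G(u - v) <= lam
    A_ub = np.vstack([np.hstack([-G,  G]),   #  X'y - G(u - v) <= lam
                      np.hstack([ G, -G])])  # -(X'y - G(u - v)) <= lam
    b_ub = np.concatenate([lam - X.T @ y, lam + X.T @ y])
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=(0, None), method="highs")
    u, v = res.x[:p], res.x[p:]
    return u - v

# Simulated sparse regression problem (illustrative dimensions and signal).
rng = np.random.default_rng(0)
n, p, s = 100, 20, 3
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:s] = [3.0, -2.0, 1.5]
y = X @ beta + 0.5 * rng.standard_normal(n)

lam = 2.0 * np.sqrt(2 * np.log(p))          # illustrative constraint level, not tuned
b_dantzig = fit_dantzig(X, y, lam)

# sklearn's Lasso minimises (1/(2n))||y - Xb||^2 + alpha*||b||_1, and its KKT
# conditions give ||X'(y - Xb)||_inf <= n*alpha, so alpha = lam / n matches
# the Dantzig constraint level used above.
b_lasso = Lasso(alpha=lam / n, fit_intercept=False, max_iter=50000).fit(X, y).coef_

print("max |dantzig - lasso| =", np.max(np.abs(b_dantzig - b_lasso)))
```

When the design satisfies conditions of the kind discussed in the paper, the reported maximum discrepancy is essentially zero; in general the two estimates need not coincide, which is exactly the comparison the empirical studies examine.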