Sparse estimation from noisy observations of an overdetermined linear system

This note studies a method for estimating a finite number of unknown parameters from linear equations perturbed by Gaussian noise. When the unknown parameter vector has only a few nonzero entries, the proposed estimator is more efficient than a traditional approach. The method consists of three steps: (1) a classical Least Squares Estimate (LSE); (2) support recovery via a Linear Programming (LP) optimization problem, whose solution can be computed by a soft-thresholding step; (3) a de-biasing step using an LSE restricted to the estimated support set. The main contribution of this note is a formal derivation of an oracle property of the final estimate: as the number of observations tends to infinity, the estimate equals, with probability one, the LSE based on the support of the true parameters.
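The three steps above can be sketched in a few lines of numpy. This is a minimal illustration, not the note's actual algorithm: the threshold `tau` is a hypothetical tuning parameter standing in for the LP formulation, whose componentwise solution is assumed here to reduce to soft thresholding of the LSE.

```python
import numpy as np

def sparse_oracle_estimate(A, y, tau):
    """Sketch of the three-step estimator for y = A @ theta + noise,
    with A overdetermined (more rows than columns).

    tau is a hypothetical threshold standing in for the LP step.
    """
    # Step 1: classical least squares on the full system.
    theta_ls, *_ = np.linalg.lstsq(A, y, rcond=None)
    # Step 2: support recovery by soft thresholding the LSE
    # (assumed componentwise form of the LP solution).
    theta_st = np.sign(theta_ls) * np.maximum(np.abs(theta_ls) - tau, 0.0)
    support = np.flatnonzero(theta_st)
    # Step 3: de-bias by re-solving least squares on the estimated support.
    theta = np.zeros_like(theta_ls)
    if support.size:
        theta[support], *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
    return theta, support
```

On a well-conditioned noiseless system the estimated support coincides with the true one and step 3 then reproduces the oracle LSE exactly, which is the behavior the note's asymptotic result guarantees with probability one.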
