We consider the regression model with observation error in the design: $y = X\theta^* + \xi$, $Z = X + \Xi$. Here the random vector $y \in \mathbb{R}^n$ and the random $n \times p$ matrix $Z$ are observed, the $n \times p$ matrix $X$ is unknown, $\Xi$ is an $n \times p$ random noise matrix, $\xi \in \mathbb{R}^n$ is a random noise vector, and $\theta^*$ is a vector of unknown parameters to be estimated. We consider the setting where the dimension $p$ can be much larger than the sample size $n$ and $\theta^*$ is sparse. Because of the presence of the noise matrix $\Xi$, the commonly used Lasso and Dantzig selector are unstable. An alternative procedure, the Matrix Uncertainty (MU) selector, was proposed in Rosenbaum and Tsybakov (2010) to account for this noise. Its properties were studied there for sparse $\theta^*$ under the assumption that the noise matrix $\Xi$ is deterministic and its values are small. In this paper, we propose a modification of the MU selector for the case where $\Xi$ is a random matrix with zero-mean entries whose variances can be estimated. This is the case, for example, in the model where the entries of $X$ are missing at random. We show both theoretically and numerically that, under these conditions, the new estimator, called the Compensated MU selector, achieves better estimation accuracy than the original MU selector.
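The effect of design noise with known variances can be illustrated with a small simulation. The sketch below is not the MU selector or its compensated version. It is ordinary least squares in a low-dimensional setting ($p$ much smaller than $n$), with the Gram matrix of $Z$ corrected by subtracting a diagonal matrix of noise variances. The function name `simulate`, the parameter values, and the Gaussian distributions are all assumptions made for illustration only.

```python
import numpy as np

def simulate(n=2000, p=5, noise_sd=0.1, design_noise_sd=0.5, seed=0):
    """Draw (X, Z, y, theta) from y = X theta + xi, Z = X + Xi (hypothetical setup)."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, p))                 # unobserved true design
    theta = np.zeros(p)
    theta[:2] = [1.0, -0.5]                     # sparse parameter vector
    y = X @ theta + noise_sd * rng.normal(size=n)
    Z = X + design_noise_sd * rng.normal(size=(n, p))  # observed noisy design
    return X, Z, y, theta

design_noise_sd = 0.5
X, Z, y, theta = simulate(design_noise_sd=design_noise_sd)
n, p = Z.shape

# Diagonal matrix of design-noise variances, assumed known here.
D = design_noise_sd**2 * np.eye(p)

gram = Z.T @ Z / n        # estimates E[x x^T] + D, so it is inflated on the diagonal
cross = Z.T @ y / n       # estimates E[x x^T] theta when Xi is independent of X and xi

theta_naive = np.linalg.solve(gram, cross)      # ignores the design noise
theta_comp = np.linalg.solve(gram - D, cross)   # subtracts the noise variances

err_naive = np.linalg.norm(theta_naive - theta)
err_comp = np.linalg.norm(theta_comp - theta)
print(f"naive error: {err_naive:.3f}, compensated error: {err_comp:.3f}")
```

In this toy setting the uncorrected estimate is shrunk toward zero because the Gram matrix of $Z$ includes the noise variances, and subtracting them removes that bias. When $p$ exceeds $n$ the corrected Gram matrix is not invertible, which is why a sparsity-based procedure is needed in the setting the abstract considers.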
[1] A. Shiryayev. On Sums of Independent Random Variables, 1992.
[2] A. Tsybakov, et al. Aggregation for Gaussian regression, 2007, arXiv:0710.3654.
[3] Terence Tao, et al. The Dantzig selector: Statistical estimation when p is much larger than n, 2005, arXiv:math/0506081.
[4] A. Tsybakov, et al. Sparsity oracle inequalities for the Lasso, 2007, arXiv:0705.3308.
[5] Karim Lounici. Sup-norm convergence rate and sign concentration property of Lasso and Dantzig estimators, 2008, arXiv:0801.4610.
[6] V. Koltchinskii. The Dantzig selector and sparsity oracle inequalities, 2009, arXiv:0909.0861.
[7] P. Bickel, et al. Simultaneous analysis of Lasso and Dantzig selector, 2008, arXiv:0801.1095.
[8] A. Tsybakov, et al. Sparse recovery under matrix uncertainty, 2008, arXiv:0812.2818.
[9] Victor Chernozhukov, et al. High Dimensional Sparse Econometric Models: An Introduction, 2011, arXiv:1106.5242.
[10] V. Koltchinskii, et al. Oracle inequalities in empirical risk minimization and sparse recovery problems, 2011.
[11] Sara van de Geer, et al. Statistics for High-Dimensional Data, 2011.
[12] A. Tsybakov, et al. High-dimensional instrumental variables regression and confidence sets -- v2/2012, 2018, arXiv:1812.11330.