Variance Breakdown of Huber (M)-estimators: $n/p \rightarrow m \in (1,\infty)$

A half century ago, Huber evaluated the minimax asymptotic variance in scalar location estimation, $\min_\psi \max_{F \in {\cal F}_\epsilon} V(\psi, F) = \frac{1}{I(F_\epsilon^*)}$, where $V(\psi,F)$ denotes the asymptotic variance of the $(M)$-estimator for location with score function $\psi$, and $I(F_\epsilon^*)$ is the minimal Fisher information $\min_{F \in {\cal F}_\epsilon} I(F)$ over the class ${\cal F}_\epsilon$ of $\epsilon$-contaminated Normal distributions. We consider the linear regression model $Y = X\theta_0 + W$ with errors $W_i \sim_{\text{i.i.d.}} F$ and i.i.d. Normal predictors $X_{i,j}$, working in the high-dimensional-limit asymptotic where the number $n$ of observations and the number $p$ of variables both grow large while $n/p \rightarrow m \in (1,\infty)$; hence $m$ plays the role of `asymptotic number of observations per parameter estimated'. Let $V_m(\psi,F)$ denote the per-coordinate asymptotic variance of the $(M)$-estimator of regression in the $n/p \rightarrow m$ regime. Then $V_m \neq V$; however, $V_m \rightarrow V$ as $m \rightarrow \infty$. In this paper we evaluate the minimax asymptotic variance of the Huber $(M)$-estimate. The statistician minimizes over the family $(\psi_\lambda)_{\lambda > 0}$ of all tunings of Huber $(M)$-estimates of regression, and Nature maximizes over gross-error contaminations $F \in {\cal F}_\epsilon$. Suppose that $I(F_\epsilon^*) \cdot m > 1$. Then $\min_\lambda \max_{F \in {\cal F}_\epsilon} V_m(\psi_\lambda, F) = \frac{1}{I(F_\epsilon^*) - 1/m}$. Strikingly, if $I(F_\epsilon^*) \cdot m \leq 1$, then the minimax asymptotic variance is $+\infty$. Breakdown thus occurs exactly where $I(F_\epsilon^*) \cdot m = 1$, that is, where the minimal Fisher information per parameter equals unity.
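To make the formula $1/(I(F_\epsilon^*) - 1/m)$ concrete, the following is a minimal numerical sketch, not code from the paper. It assumes the classical closed form from Huber's location theory, which the abstract does not restate: the capping constant $k$ solves $2\phi(k)/k - 2\Phi(-k) = \epsilon/(1-\epsilon)$, and $I(F_\epsilon^*) = (1-\epsilon)(2\Phi(k) - 1)$. All function names are hypothetical.

```python
import math


def std_normal_pdf(x):
    """Density of the standard Normal distribution."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def std_normal_cdf(x):
    """Distribution function of the standard Normal distribution."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def huber_capping_constant(eps, tol=1e-12):
    """Solve 2*phi(k)/k - 2*Phi(-k) = eps/(1-eps) for k by bisection.

    Assumption: this is the classical equation for the least favorable
    eps-contaminated Normal distribution; it is not stated in the abstract.
    """
    if not 0.0 < eps < 1.0:
        raise ValueError("eps must lie strictly between 0 and 1")
    target = eps / (1.0 - eps)

    def g(k):
        return 2.0 * std_normal_pdf(k) / k - 2.0 * std_normal_cdf(-k)

    lo, hi = 1e-6, 10.0  # g is strictly decreasing in k on this bracket
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) > target:
            lo = mid  # g too large, so the root lies at larger k
        else:
            hi = mid
    return 0.5 * (lo + hi)


def minimal_fisher_information(eps):
    """I(F_eps^*) = (1 - eps) * (2*Phi(k) - 1), under the assumption above."""
    k = huber_capping_constant(eps)
    return (1.0 - eps) * (2.0 * std_normal_cdf(k) - 1.0)


def minimax_variance(eps, m):
    """Minimax asymptotic variance 1 / (I - 1/m), or +inf when I * m <= 1."""
    info = minimal_fisher_information(eps)
    if info * m <= 1.0:
        return math.inf  # breakdown regime
    return 1.0 / (info - 1.0 / m)


if __name__ == "__main__":
    eps = 0.05
    info = minimal_fisher_information(eps)
    print("breakdown threshold m* = 1/I:", 1.0 / info)
    for m in (1.2, 2.0, 5.0, 50.0):
        print("m =", m, "minimax variance =", minimax_variance(eps, m))
```

Under these assumptions, the threshold $m^* = 1/I(F_\epsilon^*)$ equals Huber's scalar minimax variance, so heavier contamination raises the number of observations per parameter needed to avoid breakdown.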
