The main contributions of robust statistics to statistical science and a new challenge

In the first part of the paper, we trace the development of robust statistics through those of its main contributions that have penetrated mainstream statistics. The goal is neither to provide a full overview of robust statistics nor to compile a complete list of its tools and methods, but to focus on the basic concepts that have become standard ideas and tools in modern statistics. In the second part, we turn to the particular challenge posed by high-dimensional statistics and discuss how robustness ideas can be used and adapted in this setting.
