How to Marry Robustness and Applied Statistics

A striking feature of most applied statistical analyses is the use of methods that are well known to be sensitive to outliers or to other departures from the postulated model. Because data contamination is often the rule rather than the exception, we investigate the reasons for this contradictory (and perhaps unintended) choice. We also provide empirical evidence, from a real-world regression problem concerning international trade, of the advantages of a new approach to data analysis based on monitoring. Our approach enhances the applicability of robust techniques and the interpretation of their results, and is thus a positive step towards reconciling robustness and applied statistics.
