Statistical Distances and Their Role in Robustness

Statistical distances, divergences, and related quantities have a long history and play a fundamental role in statistics, machine learning, and associated scientific disciplines. Within the statistical literature, however, this extensive role has too often been played out behind the scenes, with other aspects of a statistical problem viewed as more central, more interesting, or more important. This behind-the-scenes role shows up in estimation, where we often use estimators based on minimizing a distance, explicitly or implicitly, yet rarely study how the properties of the distance determine the properties of the resulting estimators. Distances are also prominent in goodness-of-fit testing, but the usual question asked is "how powerful is this method against a set of interesting alternatives?" rather than "what aspect of the distance between the hypothesized model and the alternative are we measuring?" Our focus is on describing the statistical properties of some of the distance measures we have found to be most important and most visible. We illustrate the robust nature of Neyman's chi-squared and the non-robust nature of Pearson's chi-squared statistics, and we discuss the concept of discretization robustness.
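The robustness contrast mentioned above can be seen in how each statistic scales squared deviations: Pearson's chi-squared divides by the expected counts, Neyman's by the observed counts, so a single wildly inflated cell dominates Pearson's statistic far more than Neyman's. A minimal numerical sketch (the counts below are made up for illustration; the function names are ours):

```python
import numpy as np

# Hypothetical data: five cells with equal expected counts,
# and one contaminated cell (the last) holding an outlying observed count.
observed = np.array([20.0, 22.0, 18.0, 21.0, 119.0])
expected = np.array([20.0, 20.0, 20.0, 20.0, 20.0])

def pearson_chi2(obs, exp):
    # Pearson's chi-squared: squared deviations scaled by the EXPECTED counts,
    # so the outlier term (119 - 20)^2 / 20 grows quadratically in the outlier.
    return float(np.sum((obs - exp) ** 2 / exp))

def neyman_chi2(obs, exp):
    # Neyman's chi-squared: squared deviations scaled by the OBSERVED counts,
    # so the outlier term (119 - 20)^2 / 119 grows only linearly in the outlier.
    return float(np.sum((obs - exp) ** 2 / obs))

print(pearson_chi2(observed, expected))  # dominated by the contaminated cell
print(neyman_chi2(observed, expected))   # much less affected by it
```

Here Pearson's statistic is roughly 490 while Neyman's is roughly 83, almost all of both totals coming from the single contaminated cell; the different denominators are what make the latter far less sensitive to the outlier.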
