Inferences Based on a Skipped Correlation Coefficient

The most popular method for detecting an association between two random variables is to test H0: ρ = 0, the hypothesis that Pearson's correlation is equal to zero. It is well known, however, that Pearson's correlation is not robust, roughly meaning that small changes in any distribution, including any bivariate normal distribution as a special case, can substantially alter its value. Moreover, the usual estimate of ρ, r, is sensitive to even a few outliers, which can mask a true association. A simple alternative to testing H0: ρ = 0 is to switch to a measure of association that guards against outliers among the marginal distributions, such as Kendall's tau, Spearman's rho, a Winsorized correlation, or a so-called percentage bend correlation. But it is known that these methods fail to take into account the overall structure of the data. Many measures of association that do take the overall structure of the data into account have been proposed, but it seems that nothing is known about how they might be used to detect dependence. Here, one such measure of association is selected, designed so that under bivariate normality its estimator gives a reasonably accurate estimate of ρ. Methods for testing the hypothesis of a zero correlation based on this measure are then studied.
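
To make the idea of a skipped correlation concrete, the following is a minimal sketch, not the procedure studied in the paper. It assumes one particular way of flagging outliers that uses the overall structure of the data: a projection-type rule centered at the marginal medians, with a MAD-median cutoff. Points flagged under any projection are removed ("skipped"), and Pearson's r is computed on what remains. The function names, the toy data, and the cutoff constant are all illustrative assumptions; the abstract does not specify them. The sketch computes only the estimate and does not carry out a hypothesis test.

```python
import math
from statistics import median


def pearson(xs, ys):
    """Ordinary Pearson correlation r."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def projection_outliers(xs, ys):
    """Flag bivariate outliers with a projection-type rule (an assumed choice).

    For each point, all points are projected onto the line through the
    center (marginal medians) and that point. A point is flagged if its
    projected distance exceeds median + c * MAD / 0.6745 for any projection.
    """
    cx, cy = median(xs), median(ys)
    # Square root of the 0.975 quantile of chi-squared with 2 degrees of
    # freedom, which equals -2 * ln(0.025) in closed form.
    c = math.sqrt(-2.0 * math.log(0.025))
    flagged = set()
    for xi, yi in zip(xs, ys):
        ux, uy = xi - cx, yi - cy
        norm = math.hypot(ux, uy)
        if norm == 0.0:
            continue  # the point coincides with the center: no direction
        dist = [abs((x - cx) * ux + (y - cy) * uy) / norm
                for x, y in zip(xs, ys)]
        med = median(dist)
        mad = median([abs(d - med) for d in dist])
        cutoff = med + c * mad / 0.6745
        flagged.update(j for j, d in enumerate(dist) if d > cutoff)
    return flagged


def skipped_correlation(xs, ys):
    """Pearson's r after removing the flagged points."""
    out = projection_outliers(xs, ys)
    keep = [j for j in range(len(xs)) if j not in out]
    return pearson([xs[j] for j in keep], [ys[j] for j in keep]), out


# Toy data (hypothetical): 20 points close to the line y = x,
# plus two planted outliers at indices 20 and 21.
xs = [float(i) for i in range(20)] + [0.0, 19.0]
ys = [i + 0.3 * math.sin(i) for i in range(20)] + [60.0, -40.0]

r_plain = pearson(xs, ys)
r_skip, flagged = skipped_correlation(xs, ys)
print("Pearson r:", round(r_plain, 3))
print("Skipped r:", round(r_skip, 3), "flagged:", sorted(flagged))
```

In this toy example the two planted outliers are enough to hide a nearly perfect linear association from Pearson's r, while the skipped estimate recovers it. Unlike a rule applied to each marginal distribution separately, the projection rule judges each point relative to the whole point cloud.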
