Permutation tests for equality of distributions in high‐dimensional settings

Motivated by applications in high-dimensional settings, we suggest a test of the hypothesis $H_0$ that two sampled distributions are identical. It is assumed that two independent datasets are drawn from the respective populations, which may be very general. In particular, the distributions may be multivariate or infinite-dimensional, in the latter case representing, for example, the distributions of random functions from one Euclidean space to another. Our test uses a measure of distance between data. This measure should be symmetric but need not satisfy the triangle inequality, so it is not essential that it be a metric. The test is based on ranking the pooled dataset, with respect to the distance and relative to any fixed data value, and repeating this operation for each fixed datum. A permutation argument enables a critical point to be chosen such that the test has concisely known significance level, conditional on the set of all pairwise distances. Copyright Biometrika Trust 2002, Oxford University Press.
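
The ranking-and-permutation idea described above can be illustrated with a minimal sketch. The abstract does not give the form of the test statistic, so the one used here (squared deviation of the first sample's rank sum from its null expectation, summed over all fixed data values) is an assumption chosen for illustration and may differ from the paper's. The function names `pairwise_ranks`, `rank_statistic` and `permutation_test` are likewise hypothetical. Only a symmetric distance matrix for the pooled data is needed, and because the ranks depend only on that matrix, permuting the sample labels gives a test that is conditional on the set of all pairwise distances.

```python
import numpy as np


def pairwise_ranks(dist):
    """For each fixed datum i (row), rank all other pooled data by distance to i.

    Row i holds ranks 1..n-1 for the other points and rank 0 for i itself.
    """
    d = np.array(dist, dtype=float)
    np.fill_diagonal(d, -np.inf)  # force each point to come first in its own row
    order = np.argsort(d, axis=1, kind="stable")
    # The argsort of an argsort converts sort positions into ranks.
    return np.argsort(order, axis=1, kind="stable")


def rank_statistic(ranks, is_x):
    """Illustrative statistic (an assumption, not taken from the paper).

    For each fixed datum, sum the ranks of the first-sample points and compare
    with the value expected if labels were exchangeable; large values suggest
    that the two samples differ.
    """
    n = ranks.shape[0]
    labels = is_x.astype(float)
    x_rank_sum = ranks @ labels      # own rank is 0, so it adds nothing
    k = labels.sum() - labels        # first-sample points other than datum i
    expected = k * n / 2.0           # mean of the ranks 1..n-1 is n/2
    return float(np.sum((x_rank_sum - expected) ** 2))


def permutation_test(dist, is_x, n_perm=199, seed=0):
    """Permutation p-value, conditional on the pairwise distances."""
    rng = np.random.default_rng(seed)
    is_x = np.asarray(is_x, dtype=bool)
    ranks = pairwise_ranks(dist)     # fixed across permutations
    observed = rank_statistic(ranks, is_x)
    exceed = sum(
        rank_statistic(ranks, rng.permutation(is_x)) >= observed
        for _ in range(n_perm)
    )
    return observed, (1 + exceed) / (1 + n_perm)


def euclidean_distances(z):
    """Symmetric distance matrix; any symmetric dissimilarity could be used."""
    return np.linalg.norm(z[:, None, :] - z[None, :, :], axis=2)


# Usage: two samples of 20 points in 5 dimensions with shifted means.
rng = np.random.default_rng(1)
x = rng.normal(0.0, 1.0, size=(20, 5))
y = rng.normal(3.0, 1.0, size=(20, 5))
pooled = np.vstack([x, y])
labels = np.array([True] * 20 + [False] * 20)
stat, p_value = permutation_test(euclidean_distances(pooled), labels)
```

The ranks are computed once and reused for every permutation, which keeps the cost of each permutation to a single matrix-vector product. For functional data, `euclidean_distances` could be swapped for any symmetric dissimilarity between curves without changing the rest of the sketch.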
