Two-sample rank tests under complex sampling

Rank tests are widely used for exploratory and formal inference in the health and social sciences. With the widespread use of data from complex survey samples in medical and social research, there is increasing demand for versions of rank tests that account for the sampling design. We propose a general approach to constructing design-based rank tests when comparing groups within a complex sample and when using a national survey as a reference distribution, and illustrate both scenarios with examples. We show that the tests have asymptotically correct levels and that the relative power of different rank tests is not greatly affected by complex sampling. Copyright 2013, Oxford University Press.
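The abstract does not spell out how the design-based tests are built. As a minimal sketch, assuming the ranks are estimated from a sampling-weighted empirical distribution function and the groups are then compared on their weighted mean rank, a point estimate could be computed as below. The function names `weighted_midranks` and `weighted_mean_rank_difference` are hypothetical, and the design-based variance (which needs the strata and cluster structure) is deliberately left out, so this is not a complete test.

```python
import numpy as np

def weighted_midranks(y, w):
    """Estimated population mid-ranks on the (0, 1) scale.

    Assumption: the rank of y_i is the weighted proportion of the
    population below y_i plus half the weighted proportion tied with it.
    """
    y = np.asarray(y, dtype=float)
    w = np.asarray(w, dtype=float)
    values, inverse = np.unique(y, return_inverse=True)
    # Total sampling weight attached to each distinct value of y.
    w_by_value = np.bincount(inverse, weights=w, minlength=len(values))
    cum = np.cumsum(w_by_value)
    mid = (cum - 0.5 * w_by_value) / w.sum()
    return mid[inverse]

def weighted_mean_rank_difference(y, group, w):
    """Weighted mean estimated rank in group 1 minus that in group 0."""
    r = weighted_midranks(y, w)
    g = np.asarray(group, dtype=bool)
    w = np.asarray(w, dtype=float)
    return np.average(r[g], weights=w[g]) - np.average(r[~g], weights=w[~g])

# Toy data with equal weights, for illustration only.
y = [1.0, 2.0, 2.0, 3.0]
w = [1.0, 1.0, 1.0, 1.0]
group = [0, 0, 1, 1]
ranks = weighted_midranks(y, w)
diff = weighted_mean_rank_difference(y, group, w)
```

With equal weights the estimated ranks reduce to the usual mid-ranks, shifted by one half and divided by the sample size. Turning the difference into a test statistic would require a standard error that respects the sampling design, for example from linearization or replicate weights.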
