Testing for conditional multiple marginal independence.

Survey respondents are often asked to pick any number of responses from a set of c possible responses. Categorical variables that summarize this kind of data are called pick any/c variables. Counts from a survey that contains a pick any/c variable along with a group variable (r levels) and a stratification variable (q levels) can be marginally summarized in an r x c x q contingency table. A natural question in this setting is whether the group variable and the pick any/c variable are marginally independent given the stratification variable. A test for conditional multiple marginal independence (CMMI) can be used to answer this question. Because subjects may pick any number of the c possible responses, the Cochran (1954, Biometrics 10, 417-451) and Mantel and Haenszel (1959, Journal of the National Cancer Institute 22, 719-748) tests cannot be applied directly: they assume that the units counted in the contingency table are independent of each other. New testing methods are therefore developed. Cochran's test statistic is extended to r x 2 x q tables, and a modified version of this statistic is proposed to test CMMI. The sampling distribution of the modified statistic can be approximated by the bootstrap. Other CMMI testing methods discussed are bootstrap p-value combination methods and Bonferroni adjustments. Simulation findings suggest that the proposed bootstrap procedures and the Bonferroni adjustments consistently hold the correct size and provide power against various alternatives.
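The Bonferroni idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions made only for this example: the group variable has r = 2 levels, each of the c items is tested with a Cochran-type statistic on its own 2 x 2 x q table (within one item every subject is counted once, so the units are independent), and each per-item p-value comes from a chi-square approximation with 1 degree of freedom rather than from a bootstrap. The function names and the data layout are hypothetical.

```python
import math


def cochran_stat(tables):
    """Cochran-type statistic for a list of 2x2 tables, one per stratum.

    Each table is [[a, b], [c, d]]: rows are the two groups, columns are
    "picked the item" and "did not pick the item".
    """
    num = 0.0
    var = 0.0
    for (a, b), (c, d) in tables:
        n = a + b + c + d
        if n == 0:
            continue  # an empty stratum contributes nothing
        row1, row2 = a + b, c + d
        col1, col2 = a + c, b + d
        num += a - row1 * col1 / n                     # observed minus expected
        var += row1 * row2 * col1 * col2 / n ** 3      # variance term with n^3
    return 0.0 if var == 0 else num ** 2 / var


def chi2_sf_1df(x):
    """Upper-tail probability of a chi-square variable with 1 df."""
    return math.erfc(math.sqrt(x / 2.0))


def bonferroni_cmmi(data, c):
    """Bonferroni-adjusted p-value over the c items (assumes r = 2 groups).

    data: list of (group, stratum, picks), where group is 0 or 1 and
    picks is a tuple of c zeros and ones (1 = item picked).
    """
    strata = sorted({s for _, s, _ in data})
    pvals = []
    for j in range(c):
        tables = []
        for s in strata:
            t = [[0, 0], [0, 0]]
            for g, st, picks in data:
                if st == s:
                    t[g][1 - picks[j]] += 1  # column 0 = picked, 1 = not picked
            tables.append(t)
        pvals.append(chi2_sf_1df(cochran_stat(tables)))
    return min(1.0, c * min(pvals)), pvals


# Hypothetical toy data: 2 strata, 2 groups, c = 2 items.
toy = [(g, s, p) for s in (0, 1) for g in (0, 1) for p in ((1, 0), (0, 1))]
adjusted_p, item_pvals = bonferroni_cmmi(toy, c=2)
```

The adjustment multiplies the smallest per-item p-value by c and caps the result at 1, so CMMI would be rejected at level alpha when `adjusted_p` falls below alpha. Replacing the chi-square approximation with bootstrap p-values, or handling r > 2, would require the extended statistic the abstract describes.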

[1]  W. G. Cochran. Some Methods for Strengthening the Common χ² Tests, 1954.

[2]  J. Koopman, et al. Condom Use and First‐Time Urinary Tract Infection, 1997, Epidemiology.

[3]  S. Gange. Generating Multivariate Categorical Variates Using the Iterative Proportional Fitting Algorithm, 1995.

[4]  Thomas M. Loughin, et al. Testing for Association in Contingency Tables with Multiple Column Responses, 1998.

[5]  P. Good, et al. A Simple Test, 1994.

[6]  A. M. Mathai. Quadratic forms in random variables, 1992.

[7]  T. M. Loughin, et al. On the first-order Rao-Scott correction of the Umesh-Loughin-Scherer statistic, 2001, Biometrics.

[8]  W. Haenszel, et al. Statistical aspects of the analysis of data from retrospective studies of disease, 1959, Journal of the National Cancer Institute.

[9]  Alan Agresti, et al. Strategies for Modeling a Categorical Variable Allowing Multiple Category Choices, 2001.

[10]  A. Scott, et al. The Analysis of Categorical Data from Complex Sample Surveys: Chi-Squared Tests for Goodness of Fit and Independence in Two-Way Tables, 1981.

[11]  L. Tippett, et al. The Methods of Statistics, 1933.

[12]  A. Agresti, et al. Categorical Data Analysis, 1991, International Encyclopedia of Statistical Science.

[13]  A. Agresti, et al. Modeling a Categorical Variable Allowing Arbitrarily Many Category Choices, 1999, Biometrics.

[14]  Gary G. Koch, et al. Average Partial Association in Three-way Contingency Tables: a Review and Discussion of Alternative Tests, 1978.

[15]  D. R. Thomas, et al. A Simple Test of Association for Contingency Tables with Multiple Column Responses, 2000, Biometrics.

[16]  U. Umesh. Predicting nominal variable relationships with multiple response, 1995.

[17]  C. Coombs. A theory of data, 1965, Psychology Review.

[18]  S. T. Buckland, et al. An Introduction to the Bootstrap, 1994.

[19]  Dan Nettleton, et al. Multiple Marginal Independence Testing for Pick Any/C Variables, 2000.