On exploratory analytic method for multi-way contingency tables with an ordinal response variable and categorical explanatory variables

Abstract: In this paper, we propose a new model-free exploratory method for descriptive modeling that identifies and measures the regression dependence between an ordinal response variable and categorical (ordinal or nominal) explanatory variables in a multi-way contingency table. The proposed methodology consists of three parts: the checkerboard copula score, the checkerboard copula regression, and the checkerboard copula association measure. The checkerboard copula score is a new type of score for ordinal variables that preserves the natural ordering of the categorical scale; it serves as the basis for the methods that measure the association between the variables of interest. The checkerboard copula regression identifies the regression dependence between an ordinal response variable and categorical explanatory variables, and it allows the identified dependence to be delineated in an exploratory manner. The checkerboard copula association measure quantifies the strength of the dependence identified by the checkerboard copula regression. We investigate the properties of the checkerboard copula scores, the checkerboard copula regression, its association measure, and their estimators. Finally, we illustrate the performance of the proposed method with simulated and real data.
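The three parts named in the abstract can be illustrated with a minimal sketch for a two-way table. The abstract does not state the formulas, so the definitions below are assumptions: the score of response category j is taken to be the midpoint of the cumulative marginal distribution on that category, the regression is the conditional mean of that score within each explanatory category, and the association measure is 12 times the weighted squared deviation of the regression from 1/2. The function names `checkerboard_scores` and `ccr_and_ccram` are hypothetical, and the paper's own estimators may differ (for example, by rescaling the measure).

```python
def checkerboard_scores(marginal):
    """Assumed score: midpoint of the cumulative marginal on each category."""
    scores, cum = [], 0.0
    for p in marginal:
        scores.append(cum + p / 2.0)  # (u_{j-1} + u_j) / 2
        cum += p
    return scores


def ccr_and_ccram(table):
    """table[i][j]: count for explanatory category i and ordinal response j.

    Assumes every row has a positive total. Returns the response scores,
    the regression value per explanatory category, and the (unscaled)
    association measure, all under the assumed definitions above.
    """
    n = float(sum(sum(row) for row in table))
    row_p = [sum(row) / n for row in table]
    col_p = [sum(row[j] for row in table) / n for j in range(len(table[0]))]
    s = checkerboard_scores(col_p)
    # Regression: mean response score within each explanatory category.
    reg = [sum(c * sj for c, sj in zip(row, s)) / sum(row) for row in table]
    # Association: 12 * weighted squared deviation of the regression from 1/2.
    ccram = 12.0 * sum(p * (r - 0.5) ** 2 for p, r in zip(row_p, reg))
    return s, reg, ccram


if __name__ == "__main__":
    # Rows: explanatory categories; columns: ordered response categories.
    scores, regression, measure = ccr_and_ccram([[20, 10, 5], [5, 10, 20]])
    print(scores, regression, measure)
```

Under these assumed definitions the scores always average to 1/2 over the response margin, so a table with independent rows and columns gives a flat regression at 1/2 and a measure of zero. For a multi-way table, one option (again an assumption) is to treat each combination of explanatory categories as one row.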
