A Class of Association Measures for Categorical Variables Based on Weighted Minkowski Distance

Measuring and testing association between categorical variables is a long-standing problem in multivariate statistics. In this paper, I define a broad class of association measures for categorical variables based on weighted Minkowski distance. The proposed framework subsumes several important measures, including Cramér’s V, distance covariance, total variation distance, and a slightly modified mean variance index. In addition, I establish the strong consistency of the defined measures for testing independence in two-way contingency tables, and derive the scaled forms of the unweighted measures.
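As a rough illustration of the idea, the sketch below computes a weighted Minkowski distance between the observed joint proportions of a two-way table and the product of its marginal proportions. The parameterization is an assumption, since the abstract does not state the exact definition: the function names, the exponent `r`, and the weight callback are all hypothetical. Under this assumed form, the standard definitions of two of the measures named above are recovered: `r = 1` with unit weights gives twice the total variation distance, and `r = 2` with weights `1 / (p_i. * p_.j)` gives the phi coefficient, which rescales to Cramér’s V.

```python
from math import sqrt
from typing import Callable, List, Optional

Table = List[List[float]]
# Hypothetical weight signature: maps (row marginal, column marginal) to a weight.
Weight = Callable[[float, float], float]


def weighted_minkowski_association(
    counts: Table, r: float = 2.0, weight: Optional[Weight] = None
) -> float:
    """(sum_ij w_ij * |p_ij - p_i. * p_.j|^r)^(1/r) for a two-way table.

    Assumed form for illustration only; not necessarily the paper's definition.
    """
    n = sum(sum(row) for row in counts)
    if n <= 0:
        raise ValueError("table must contain at least one observation")
    row_p = [sum(row) / n for row in counts]
    col_p = [sum(row[j] for row in counts) / n for j in range(len(counts[0]))]

    total = 0.0
    for i, row in enumerate(counts):
        for j, cell in enumerate(row):
            expected = row_p[i] * col_p[j]
            if expected == 0.0:
                # An empty row or column contributes nothing: the cell is 0 too.
                continue
            w = 1.0 if weight is None else weight(row_p[i], col_p[j])
            total += w * abs(cell / n - expected) ** r
    return total ** (1.0 / r)


def total_variation(counts: Table) -> float:
    """Half the unweighted L1 distance between joint and product of marginals."""
    return 0.5 * weighted_minkowski_association(counts, r=1.0)


def cramers_v(counts: Table) -> float:
    """Phi coefficient (r = 2, inverse-expected weights) rescaled to [0, 1]."""
    phi = weighted_minkowski_association(
        counts, r=2.0, weight=lambda pi, pj: 1.0 / (pi * pj)
    )
    k = min(len(counts), len(counts[0])) - 1
    return phi / sqrt(k)


if __name__ == "__main__":
    table = [[10, 20], [30, 40]]
    print("total variation:", total_variation(table))
    print("Cramér's V:", cramers_v(table))
```

Passing the weights as a callback keeps the unweighted and weighted cases in one function, so the special cases differ only in `r` and `weight`. The distance covariance and mean variance index cases are not sketched here because the abstract does not give their weights.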
