Data Information in Contingency Tables: A Fallacy of Hierarchical Loglinear Models

Information identities derived from entropy and relative entropy can be useful in statistical inference. For discrete data analyses, a recent study by the authors showed that the fundamental likelihood structure with categorical variables can be expressed in different yet equivalent information decompositions in terms of relative entropy. This clarifies an essential difference between the classical analysis of variance and the analysis of discrete data, revealing a fallacy in the analysis of hierarchical loglinear models. The discussion here is focused on the likelihood information of a three-way contingency table, without loss of generality. A classical three-way categorical data example is examined to illustrate the findings.
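To make the idea of "different yet equivalent information decompositions" concrete, the following minimal sketch checks numerically that the relative entropy between a three-way table and its full-independence fit splits in more than one way. It uses only the standard chain-rule identities for mutual information, assumed here to be representative of the kind of decomposition meant; the notation and choice of terms in the study itself may differ. The function names and the 2 x 2 x 2 counts are hypothetical and not taken from the data example.

```python
import numpy as np


def entropy(p):
    """Shannon entropy in nats of a probability array, skipping zero cells."""
    p = np.asarray(p, dtype=float).ravel()
    p = p[p > 0]
    return float(-(p * np.log(p)).sum())


def decompose(counts):
    """Information terms for a three-way table with axes (X, Y, Z)."""
    p = np.asarray(counts, dtype=float)
    p = p / p.sum()  # empirical joint distribution

    # Entropies of the joint table and of every margin.
    h_x = entropy(p.sum(axis=(1, 2)))
    h_y = entropy(p.sum(axis=(0, 2)))
    h_z = entropy(p.sum(axis=(0, 1)))
    h_xy = entropy(p.sum(axis=2))
    h_xz = entropy(p.sum(axis=1))
    h_yz = entropy(p.sum(axis=0))
    h_xyz = entropy(p)

    return {
        # Relative entropy of the joint from the product of its margins.
        "total": h_x + h_y + h_z - h_xyz,
        # Two-way mutual informations.
        "xy": h_x + h_y - h_xy,
        "xz": h_x + h_z - h_xz,
        "yz": h_y + h_z - h_yz,
        # Conditional mutual informations.
        "xy_given_z": h_xz + h_yz - h_z - h_xyz,
        "yz_given_x": h_xy + h_xz - h_x - h_xyz,
    }


# Hypothetical counts, for illustration only.
table = [[[10, 20], [30, 40]],
         [[25, 15], [5, 35]]]
d = decompose(table)

# Two equivalent ways of splitting the same total information.
split_a = d["xz"] + d["yz"] + d["xy_given_z"]
split_b = d["xy"] + d["xz"] + d["yz_given_x"]
print(d["total"], split_a, split_b)
```

Multiplying each term by twice the sample size gives the corresponding likelihood-ratio statistic, so both splits partition the same test of mutual independence into different sets of components.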
