An Analysis of Variance for Categorical Data

Abstract: A measure of variation for categorical data is discussed. We develop an analysis of variance for a one-way table, where the response variable is categorical. The data can be viewed alternatively as falling in a two-dimensional contingency table with one margin fixed. Components of variation are derived, and their properties are investigated under a common multinomial model. Using these components, we propose a measure of the variation in the response variable explained by the grouping variable. A test statistic is constructed on the basis of these properties, and its asymptotic behavior under the null hypothesis of independence is studied. Empirical sampling results confirming the asymptotic behavior and investigating power are included.
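The abstract gives no formulas, so the following is only an illustrative sketch of how such a decomposition might be computed. It assumes a Gini-type measure of variation for categorical responses, with total variation split into within-group and between-group components, and it assumes the commonly stated form of the resulting statistic, (n - 1)(I - 1) times the explained proportion, referred to a chi-square distribution with (I - 1)(G - 1) degrees of freedom. The function name `variation_components` and the table layout (rows are response categories, columns are groups) are hypothetical choices made for this example.

```python
from typing import Dict, List


def variation_components(table: List[List[int]]) -> Dict[str, float]:
    """Gini-type variation components for an I x G table of counts.

    table[i][j] is the number of observations in response category i
    and group j. All formulas below are assumptions of this sketch,
    not definitions quoted from the abstract.
    """
    n_cat = len(table)        # I: number of response categories
    n_grp = len(table[0])     # G: number of groups (fixed margin)

    cat_totals = [sum(row) for row in table]
    grp_totals = [sum(table[i][j] for i in range(n_cat)) for j in range(n_grp)]
    n = sum(cat_totals)

    # Total variation in the response, ignoring the grouping.
    tss = n / 2 - sum(c * c for c in cat_totals) / (2 * n)

    # Within-group variation: the same measure applied inside each group.
    wss = n / 2 - 0.5 * sum(
        sum(table[i][j] ** 2 for i in range(n_cat)) / grp_totals[j]
        for j in range(n_grp)
    )

    bss = tss - wss           # between-group variation
    r2 = bss / tss            # proportion of variation explained

    # Assumed form of the test statistic and its degrees of freedom.
    statistic = (n - 1) * (n_cat - 1) * r2
    df = (n_cat - 1) * (n_grp - 1)

    return {"TSS": tss, "WSS": wss, "BSS": bss, "R2": r2,
            "statistic": statistic, "df": float(df)}


# Two groups whose responses never overlap: the grouping explains everything.
separated = variation_components([[10, 0], [0, 10]])

# Two groups with identical response distributions: nothing is explained.
identical = variation_components([[5, 5], [5, 5]])
```

In this sketch the explained proportion is 1 when the groups fall in disjoint response categories and 0 when every group has the same response distribution, which matches the interpretation of "variation in the response variable explained by the grouping variable" given in the abstract.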
