Marginal models for categorical data

Statistical models defined by imposing restrictions on marginal distributions of contingency tables have received considerable attention recently. This paper introduces a general definition of marginal log-linear parameters and describes conditions for a marginal log-linear parameter to be a smooth parameterization of the distribution and to be variation independent. Statistical models defined by imposing affine restrictions on the marginal log-linear parameters are investigated. These models generalize ordinary log-linear and multivariate logistic models. Sufficient conditions for a log-affine marginal model to be nonempty and to be a curved exponential family are given. Standard large-sample theory is shown to apply to maximum likelihood estimation of log-affine marginal models for a variety of sampling procedures.
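
As a rough illustration of the simplest case, consider a 2x2 table for two binary variables A and B. The sketch below computes two marginal logits (one from each one-way marginal) and the log odds ratio from the joint table, the quantities that a multivariate logistic model works with. It then checks marginal homogeneity, written as the linear restriction logit(A) - logit(B) = 0. The function name, the example table, and the choice of logit / log-odds-ratio coding are assumptions made for this illustration; the paper's general definition of marginal log-linear parameters is not reproduced here.

```python
import math


def marginal_loglinear_2x2(p):
    """Return (logit_A, logit_B, log_odds_ratio) for a 2x2 joint table.

    p[i][j] is the joint probability of A = i and B = j, with all cells
    strictly positive. Hypothetical helper, for illustration only.
    """
    # One-way marginal distributions of A (rows) and B (columns).
    a = [p[0][0] + p[0][1], p[1][0] + p[1][1]]
    b = [p[0][0] + p[1][0], p[0][1] + p[1][1]]

    # Parameters computed within the A-marginal and the B-marginal.
    logit_a = math.log(a[1] / a[0])
    logit_b = math.log(b[1] / b[0])

    # Association parameter computed within the full (joint) table.
    log_or = math.log((p[0][0] * p[1][1]) / (p[0][1] * p[1][0]))
    return logit_a, logit_b, log_or


# Example table (assumed values): symmetric, so the two marginals coincide.
p = [[0.4, 0.1],
     [0.1, 0.4]]
logit_a, logit_b, log_or = marginal_loglinear_2x2(p)

# Marginal homogeneity as a linear restriction on the marginal parameters.
homogeneous = abs(logit_a - logit_b) < 1e-12
print(logit_a, logit_b, log_or, homogeneous)
```

In this coding the restriction involves only the two marginal logits, so the log odds ratio remains free to vary. This is a small instance of the variation independence mentioned above.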
