Maximum likelihood estimation in log-linear models

We study maximum likelihood estimation in log-linear models under conditional Poisson sampling schemes. We derive necessary and sufficient conditions for the existence of the maximum likelihood estimator (MLE) of the model parameters, and we investigate the estimability of the natural and mean-value parameters when the MLE does not exist. Our conditions focus on the role of sampling zeros in the observed table. We situate our results within the framework of extended exponential families and exploit the geometric properties of log-linear models. We propose algorithms for extended maximum likelihood estimation that improve and correct the existing algorithms for log-linear model analysis.
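The role of sampling zeros can be illustrated with a minimal sketch on the simplest special case, the independence model for a two-way table, where the fitted means have the closed form m_ij = (row total i) x (column total j) / n. This is a textbook special case and not the general algorithm proposed in the paper; the function names `independence_fit` and `mle_exists` are hypothetical, and the table is assumed to contain at least one positive count. In this model the MLE of the log-linear parameters exists exactly when every fitted mean is positive, that is, when no row or column margin is zero.

```python
import numpy as np


def independence_fit(table):
    """Fitted means under two-way independence: m_ij = n_i+ * n_+j / n."""
    t = np.asarray(table, dtype=float)
    row = t.sum(axis=1, keepdims=True)  # row margins, shape (I, 1)
    col = t.sum(axis=0, keepdims=True)  # column margins, shape (1, J)
    return row @ col / t.sum()


def mle_exists(table):
    """Illustrative check for the independence model only.

    The MLE of the natural parameters exists iff all fitted means are
    strictly positive, i.e. iff every row and column margin is positive.
    """
    return bool(np.all(independence_fit(table) > 0))


# A sampling zero that leaves all margins positive: the MLE exists.
zero_but_ok = [[0, 3], [2, 5]]

# Sampling zeros that wipe out a whole row margin: some fitted means are 0,
# so the corresponding log-means would have to be minus infinity.
zero_margin = [[0, 0], [2, 5]]

if __name__ == "__main__":
    for name, tab in [("zero_but_ok", zero_but_ok), ("zero_margin", zero_margin)]:
        print(name, "MLE exists:", mle_exists(tab))
        print(independence_fit(tab))
```

In the second table the fitted means are still well defined (some of them equal zero), which is the sense in which mean-value parameters can remain estimable in the extended family even though the natural parameters are not. For models without closed-form fits, a single zero cell pattern can cause nonexistence even when all margins are positive, which is why a general check needs the geometric conditions the abstract refers to.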
