Estimating cell probabilities in contingency tables with constraints on marginals/conditionals by geometric programming with applications

Contingency tables are often used to display the multivariate frequency distribution of variables of interest. Under the common multinomial assumption, the first step of contingency table analysis is to estimate the cell probabilities. It is well known that the unconstrained maximum likelihood estimator (MLE) is given by the cell counts divided by the total number of observations. However, in the presence of (complex) constraints on the unknown cell probabilities or on functions of them, the MLE and other estimators often have no closed form and must be obtained numerically. In this paper, we focus on finding the MLE of cell probabilities in contingency tables under two common types of constraints, known marginals and ordered marginals/conditionals, and propose a novel approach based on geometric programming (GP). We present two important applications that illustrate the usefulness of our approach through comparison with existing methods. Further, we show that our GP-based approach is flexible and readily implementable with little effort, and that it can provide a unified framework for various types of constrained estimation of cell probabilities in contingency tables.
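
To make the two estimation problems in the abstract concrete, the following is a minimal sketch for a two-way table. It is not the formulation used in the paper. The notation is assumed here for illustration: $n_{ij}$ are the observed cell counts, $N$ their total, $p_{ij}$ the cell probabilities, and $r_i$, $c_j$ the known row and column marginals. The first line is the unconstrained MLE; the rest writes the known-marginals problem in standard GP form (a monomial objective with posynomial constraints bounded by 1).

```latex
% Unconstrained MLE under the multinomial model
\hat{p}_{ij} = \frac{n_{ij}}{N}, \qquad N = \sum_{i,j} n_{ij}.

% MLE with known marginals, written as a geometric program.
% Maximizing \prod_{i,j} p_{ij}^{n_{ij}} is equivalent to minimizing its reciprocal.
\begin{aligned}
\text{minimize}   \quad & \prod_{i,j} p_{ij}^{-n_{ij}} \\
\text{subject to} \quad & r_i^{-1} \sum_{j} p_{ij} \le 1 \quad \text{for all } i, \\
                        & c_j^{-1} \sum_{i} p_{ij} \le 1 \quad \text{for all } j, \\
                        & p_{ij} > 0 \quad \text{for all } i, j.
\end{aligned}
```

The marginal equalities are relaxed to inequalities because GP does not allow posynomial equality constraints. Assuming every $n_{ij} > 0$ and $\sum_i r_i = \sum_j c_j = 1$, the objective strictly decreases in each $p_{ij}$, so all constraints hold with equality at the optimum and the relaxation recovers the constrained MLE. Tables with zero counts need separate care.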
