A Solution to Separation in Binary Response Models

A common problem in models for dichotomous dependent variables is “separation,” which occurs when one or more of a model's covariates perfectly predict some binary outcome. Separation raises a particularly difficult set of issues, often forcing researchers to choose between omitting clearly important covariates and undertaking post hoc data or estimation corrections. In this article I present a method for solving the separation problem, based on a penalized likelihood correction to the standard binomial GLM score function. I then apply this method to data from an important study on the postwar fate of leaders.
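To make the idea of a penalized score function concrete, the following is a minimal sketch, assuming a Firth-type penalty (the Jeffreys invariant prior) applied to a logit model. This is one well-known correction of this kind and is not necessarily the exact procedure used in the article. Under that assumption the usual score X'(y - pi) becomes X'(y - pi + h(1/2 - pi)), where h holds the diagonal elements of the weighted hat matrix. The function name `firth_logit` and the toy data are hypothetical and chosen only for illustration.

```python
import numpy as np


def firth_logit(X, y, max_iter=100, tol=1e-8):
    """Logit coefficients from a Firth-type penalized score (illustrative sketch).

    Assumption: the penalty is 0.5 * log det(X'WX), which changes the
    score from X'(y - pi) to X'(y - pi + h * (0.5 - pi)).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    beta = np.zeros(X.shape[1])
    for _ in range(max_iter):
        pi = 1.0 / (1.0 + np.exp(-X @ beta))       # fitted probabilities
        w = pi * (1.0 - pi)                        # binomial GLM weights
        info = X.T @ (w[:, None] * X)              # Fisher information X'WX
        info_inv = np.linalg.inv(info)
        # Hat-matrix diagonals: h_i = w_i * x_i' (X'WX)^{-1} x_i
        h = w * np.einsum("ij,jk,ik->i", X, info_inv, X)
        score = X.T @ (y - pi + h * (0.5 - pi))    # penalized score
        step = info_inv @ score                    # scoring step
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta


# Hypothetical toy data: whenever x = 1 the outcome is always 1, so x
# perfectly predicts y in that group and the ordinary maximum likelihood
# slope estimate is infinite.
x = np.array([0] * 8 + [1] * 4)
y = np.array([1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 1])
X = np.column_stack([np.ones_like(x), x])          # intercept + covariate

beta_hat = firth_logit(X, y)
print(beta_hat)                                    # finite intercept and slope
```

In this saturated two-group example, the penalized estimates coincide with the log odds computed after adding 1/2 to each cell of the 2x2 table, which gives a quick check that the slope is finite even though the data are separated.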
