Separation in Logistic Regression: Causes, Consequences, and Control

Separation arises in regression models with a discrete outcome (such as logistic regression) when the covariates perfectly predict the outcome. It is most frequent under the same conditions that lead to small-sample and sparse-data bias, such as a rare outcome, rare exposures, highly correlated covariates, or covariates with strong effects. In theory, separation produces infinite estimates for some coefficients. In practice, however, separation may go unnoticed or be mishandled because software is limited in recognizing the problem, handling it, and notifying the user. We discuss causes of separation in logistic regression and describe how common software packages deal with it. We then describe methods that remove separation, focusing on the same penalized-likelihood techniques used to address more general sparse-data problems. These methods improve accuracy, avoid software problems, and allow interpretation as Bayesian analyses with weakly informative priors. We discuss likelihood penalties, including some that can be implemented easily with any software package, and their relative advantages and disadvantages. We illustrate the ideas and methods using data from a case-control study of contraceptive practices and urinary tract infection.
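
The basic phenomenon can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not one of the analyses described above: the four data points are invented, the model has a single slope and no intercept, and the penalty is a simple ridge-type term (equivalent to a normal prior with mean 0 and an arbitrarily chosen variance of 4), used here only because it is easy to write down. The function names `log_likelihood` and `fit` are hypothetical.

```python
import math

# Invented toy data: every x < 0 has y = 0 and every x > 0 has y = 1,
# so the covariate separates the two outcome groups completely.
X = [-2.0, -1.0, 1.0, 2.0]
Y = [0, 0, 1, 1]


def log_likelihood(beta, penalty_variance=None):
    """Log-likelihood of a no-intercept logistic model with slope beta.

    If penalty_variance is given, subtract beta^2 / (2 * penalty_variance),
    which corresponds to a normal prior centered at 0 (an assumption made
    for this sketch).
    """
    total = 0.0
    for x, y in zip(X, Y):
        eta = beta * x
        # Numerically stable evaluation of log(1 + exp(eta)).
        log1pexp = max(eta, 0.0) + math.log1p(math.exp(-abs(eta)))
        total += y * eta - log1pexp
    if penalty_variance is not None:
        total -= beta ** 2 / (2.0 * penalty_variance)
    return total


def fit(penalty_variance=None, max_iter=20, tol=1e-8):
    """Newton-Raphson for the slope; returns (beta, converged)."""
    beta = 0.0
    for _ in range(max_iter):
        score, info = 0.0, 0.0
        for x, y in zip(X, Y):
            p = 1.0 / (1.0 + math.exp(-beta * x))
            score += (y - p) * x           # first derivative
            info += p * (1.0 - p) * x * x  # negative second derivative
        if penalty_variance is not None:
            score -= beta / penalty_variance
            info += 1.0 / penalty_variance
        step = score / info
        beta += step
        if abs(step) < tol:
            return beta, True
    return beta, False


# Unpenalized fit: the slope keeps growing and the iteration limit is hit.
beta_ml, converged_ml = fit()
# Penalized fit: the penalty makes the maximum finite.
beta_pen, converged_pen = fit(penalty_variance=4.0)

print("unpenalized:", beta_ml, "converged:", converged_ml)
print("penalized:  ", beta_pen, "converged:", converged_pen)
```

With separated data the unpenalized log-likelihood keeps increasing toward 0 as the slope grows, so the iteration stops only because it runs out of steps. A routine that reports its last iterate without a warning would present a large but arbitrary coefficient as if it were an estimate. Adding the penalty term makes the objective strictly concave with a finite maximizer.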
