A solution to the problem of separation in logistic regression

The phenomenon of separation, or monotone likelihood, is observed when fitting a logistic model if the likelihood converges while at least one parameter estimate diverges to ±∞. Separation primarily occurs in small samples with several unbalanced and highly predictive risk factors. A procedure by Firth, originally developed to reduce the bias of maximum likelihood estimates, is shown to provide an ideal solution to separation. It produces finite parameter estimates by means of penalized maximum likelihood estimation. Corresponding Wald tests and confidence intervals are available, but it is shown that penalized likelihood ratio tests and profile penalized likelihood confidence intervals are often preferable. The clear advantage of the procedure over previous options of analysis is demonstrated by the statistical analysis of two cancer studies. Copyright © 2002 John Wiley & Sons, Ltd.
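
To illustrate how penalized maximum likelihood keeps the estimates finite, here is a minimal sketch in Python. It is not the authors' implementation. It assumes the commonly stated form of Firth's modified score for logistic regression, X'(y - pi + h(1/2 - pi)), where h holds the diagonal of the weighted hat matrix. The function name `firth_logistic`, its parameters, and the toy data are hypothetical and chosen only for illustration.

```python
import numpy as np


def firth_logistic(X, y, max_iter=100, tol=1e-8, max_step=5.0):
    """Firth-type penalized logistic regression (illustrative sketch).

    X is an (n, p) design matrix that already contains an intercept column
    if one is wanted; y is a length-n vector of 0/1 outcomes.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    beta = np.zeros(X.shape[1])
    for _ in range(max_iter):
        pi = 1.0 / (1.0 + np.exp(-X @ beta))      # fitted probabilities
        w = pi * (1.0 - pi)                       # logistic variance weights
        info = X.T @ (X * w[:, None])             # Fisher information X'WX
        info_inv = np.linalg.inv(info)
        # Diagonal of the hat matrix W^(1/2) X (X'WX)^(-1) X' W^(1/2)
        h = w * np.einsum("ij,jk,ik->i", X, info_inv, X)
        # Modified score (assumed form): X'(y - pi + h * (1/2 - pi))
        score = X.T @ (y - pi + h * (0.5 - pi))
        step = info_inv @ score
        # Cap the step length so a single update cannot overshoot wildly
        biggest = np.max(np.abs(step))
        if biggest > max_step:
            step *= max_step / biggest
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta


# Toy data with complete separation: y = 1 exactly when x > 0, so the
# ordinary maximum likelihood slope would diverge.
X_demo = np.array([[1.0, -2.0], [1.0, -1.0], [1.0, 1.0], [1.0, 2.0]])
y_demo = np.array([0, 0, 1, 1])
print(firth_logistic(X_demo, y_demo))
```

As a sanity check on the sketch, with an intercept-only model the modified score can be solved by hand: the fitted probability is (s + 1/2)/(n + 1), where s is the number of events among n observations. The estimate therefore stays finite even when every outcome is an event.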
