Small sample estimation of log odds ratios from logistic regression and fourfold tables.

Schaefer has proposed a method to correct maximum likelihood logistic regression coefficients for bias in small samples. We show here that, in the special case of a single dichotomous regression variable, this correction reduces to an earlier result of Haldane for the estimation of a single log odds ratio. This paper reviews various estimators for the log odds ratio in a fourfold table and compares the properties of two of them by complete enumeration in sets of tables with small sample sizes. The first, also due to Haldane, adds 1/2 to each cell of the table; the second adds 1/2 to all cells only if a zero frequency arises. We evaluate the bias and mean squared error (MSE) of both estimators in sets of tables with various sample sizes and odds ratios. Haldane's estimator usually has lower bias and MSE, and so we do not in general recommend the practice of adding 1/2 only as necessary. Exceptions might occur if one has good a priori estimates of the outcome probabilities in the two samples under comparison.
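
The two estimators and the complete-enumeration comparison can be illustrated with a minimal Python sketch. It assumes the fourfold table arises from two independent binomial samples of sizes n1 and n2 with outcome probabilities p1 and p2; the abstract does not state the sampling model explicitly. The function names (haldane, add_half_if_zero, bias_and_mse) and the cell labels a, b, c, d are illustrative choices and are not taken from the paper.

```python
from math import comb, log


def haldane(a, b, c, d):
    """Log odds ratio after adding 1/2 to every cell of the fourfold table."""
    return log((a + 0.5) * (d + 0.5) / ((b + 0.5) * (c + 0.5)))


def add_half_if_zero(a, b, c, d):
    """Add 1/2 to all cells only when at least one cell is zero."""
    if 0 in (a, b, c, d):
        return haldane(a, b, c, d)
    return log(a * d / (b * c))


def binom_pmf(k, n, p):
    """Binomial probability of k successes in n trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def bias_and_mse(estimator, n1, p1, n2, p2):
    """Exact bias and MSE by enumerating every possible table.

    Assumes two independent binomial samples (an assumption of this sketch).
    Row 1 is (x1, n1 - x1) and row 2 is (x2, n2 - x2).
    """
    true_log_or = log(p1 * (1 - p2) / ((1 - p1) * p2))
    mean = 0.0
    mse = 0.0
    for x1 in range(n1 + 1):
        for x2 in range(n2 + 1):
            weight = binom_pmf(x1, n1, p1) * binom_pmf(x2, n2, p2)
            estimate = estimator(x1, n1 - x1, x2, n2 - x2)
            mean += weight * estimate
            mse += weight * (estimate - true_log_or) ** 2
    return mean - true_log_or, mse


if __name__ == "__main__":
    # Hypothetical settings chosen only for illustration.
    for name, est in [("Haldane", haldane), ("add 1/2 if zero", add_half_if_zero)]:
        bias, mse = bias_and_mse(est, n1=10, p1=0.3, n2=10, p2=0.6)
        print(f"{name}: bias={bias:.4f}, MSE={mse:.4f}")
```

Enumerating all (n1 + 1)(n2 + 1) tables gives exact expectations rather than simulation estimates, which is feasible only because the sample sizes are small.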