A Fresh Look at the Discriminant Function Approach for Estimating Crude or Adjusted Odds Ratios

Assuming a binary outcome, logistic regression is the most common approach to estimating a crude or adjusted odds ratio corresponding to a continuous predictor. We revisit a method termed the discriminant function approach, which leads to closed-form estimators and corresponding standard errors. In its most appealing application, we show that the approach suggests a multiple linear regression of the continuous predictor of interest on the outcome and other covariates, in place of the traditional logistic regression model. If standard diagnostics support the assumptions (including normality of errors) accompanying this linear regression model, the resulting estimator has demonstrable advantages over the usual maximum likelihood estimator via logistic regression. These include improvements in terms of bias and efficiency based on a minimum variance unbiased estimator of the log odds ratio, as well as the availability of an estimate when logistic regression fails to converge due to a separation of data points. Use of the discriminant function approach as described here for multivariable analysis requires less stringent assumptions than those for which it was historically criticized, and is worth considering when the adjusted odds ratio associated with a particular continuous predictor is of primary interest. Simulation and case studies illustrate these points.
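The core idea can be illustrated with a minimal sketch, under assumptions that go beyond what the abstract states. Suppose the continuous predictor X, given the binary outcome Y and covariates C, follows a normal linear model with constant error variance sigma^2 and a coefficient alpha1 on Y. By Bayes' rule, the log odds ratio for Y per unit increase in X is then alpha1 / sigma^2. The function name `discriminant_log_or`, the simulated data, and the small-sample correction factor (df - 2) / df (from standard normal theory, where the residual sum of squares is independent of the coefficient estimate) are illustrative choices, not necessarily the authors' exact formulas.

```python
import numpy as np

def discriminant_log_or(x, y, covariates=None):
    """Hypothetical helper: regress x on (1, y, covariates) by least squares
    and return (naive, adjusted) estimates of the log odds ratio per unit x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.shape[0]
    cols = [np.ones(n), y]
    if covariates is not None:
        cols.append(np.asarray(covariates, dtype=float).reshape(n, -1))
    design = np.column_stack(cols)

    # Ordinary least squares fit of the predictor on outcome and covariates.
    coef, _, _, _ = np.linalg.lstsq(design, x, rcond=None)
    resid = x - design @ coef
    sse = float(resid @ resid)
    df = n - design.shape[1]

    # Plug-in estimate: coefficient on y divided by the mean squared error.
    naive = coef[1] / (sse / df)
    # Under normal errors E[1/MSE] = df / ((df - 2) * sigma^2), so scaling by
    # (df - 2) / df removes the bias of the plug-in ratio (assumes df > 2).
    adjusted = coef[1] * (df - 2) / sse
    return naive, adjusted

# Simulated example (assumed parameter values, for illustration only).
rng = np.random.default_rng(0)
n = 5000
c = rng.normal(size=n)
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-(-0.5 + 0.3 * c))))
sigma2 = 2.0
x = 1.0 + 0.8 * y + 0.5 * c + rng.normal(scale=np.sqrt(sigma2), size=n)

naive, adjusted = discriminant_log_or(x, y, c)
true_log_or = 0.8 / sigma2  # alpha1 / sigma^2 under the assumed model
print(f"true={true_log_or:.3f} naive={naive:.3f} adjusted={adjusted:.3f}")
```

Because the fit is ordinary least squares, it returns an estimate even when the outcome groups are completely separated on X, a situation in which logistic regression would fail to converge. The estimate is only as trustworthy as the normality and constant-variance assumptions for the regression of X on Y and C, which is why residual diagnostics matter here.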
