Testing treatment effects in unconfounded studies under model misspecification: Logistic regression, discretization, and their combination

Logistic regression is commonly used to test for treatment effects in observational studies. If the distribution of a continuous covariate differs between the treated and control populations and the true link is not logistic, logistic regression yields an invalid hypothesis test, even in an unconfounded study. Discretizing the covariate into intervals, a commonly used technique, does not correct this flaw. A valid test can be obtained by discretization followed by regression adjustment within each interval. Copyright © 2009 John Wiley & Sons, Ltd.
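As an illustration only, the combined procedure named in the abstract (discretization followed by regression adjustment within each interval) might be sketched as follows. The abstract does not specify the details, so everything here is an assumption: equal-count intervals, a within-interval logistic regression of the outcome on treatment and the centered covariate, and inverse-variance pooling of the per-interval treatment coefficients into a single z-statistic. The names `invert`, `fit_logistic`, `stratified_adjusted_test`, and `simulate` are hypothetical helpers, and the complementary log-log link in `simulate` is just one example of a non-logistic link.

```python
import math
import random


def invert(m):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(m)
    a = [row[:] + [1.0 if i == j else 0.0 for j in range(n)] for i, row in enumerate(m)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(a[r][c]))
        a[c], a[piv] = a[piv], a[c]
        d = a[c][c]
        a[c] = [v / d for v in a[c]]
        for r in range(n):
            if r != c:
                f = a[r][c]
                a[r] = [vr - f * vc for vr, vc in zip(a[r], a[c])]
    return [row[n:] for row in a]


def fit_logistic(rows, y, n_iter=25, ridge=1e-8):
    """Newton-Raphson logistic fit; rows must include a leading 1.0 (intercept).

    Returns (coefficients, approximate covariance). The covariance is the inverse
    information from the last iteration, which is close to the value at the
    estimate once the iterations have converged.
    """
    p = len(rows[0])
    beta = [0.0] * p
    cov = None
    for _ in range(n_iter):
        grad = [0.0] * p
        hess = [[ridge if i == j else 0.0 for j in range(p)] for i in range(p)]
        for x, yi in zip(rows, y):
            eta = max(-30.0, min(30.0, sum(b * v for b, v in zip(beta, x))))
            mu = 1.0 / (1.0 + math.exp(-eta))
            w = mu * (1.0 - mu)
            for i in range(p):
                grad[i] += (yi - mu) * x[i]
                for j in range(p):
                    hess[i][j] += w * x[i] * x[j]
        cov = invert(hess)
        beta = [b + sum(cov[i][j] * grad[j] for j in range(p)) for i, b in enumerate(beta)]
    return beta, cov


def stratified_adjusted_test(x, t, y, n_bins=5):
    """Assumed procedure: bin x, adjust for x within each bin, pool treatment effects."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    size = len(order) // n_bins
    num = den = 0.0
    for b in range(n_bins):
        idx = order[b * size:] if b == n_bins - 1 else order[b * size:(b + 1) * size]
        center = sum(x[i] for i in idx) / len(idx)
        rows = [[1.0, float(t[i]), x[i] - center] for i in idx]
        beta, cov = fit_logistic(rows, [y[i] for i in idx])
        weight = 1.0 / cov[1][1]          # inverse variance of the treatment coefficient
        num += weight * beta[1]
        den += weight
    z = num / math.sqrt(den)              # pooled estimate divided by its standard error
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value
    return z, p_value


def simulate(n, effect, seed=0):
    """Toy data: covariate shifted in the treated group, complementary log-log link."""
    rng = random.Random(seed)
    x, t, y = [], [], []
    for _ in range(n):
        ti = 1 if rng.random() < 0.5 else 0
        xi = rng.gauss(0.5 * ti, 1.0)
        prob = 1.0 - math.exp(-math.exp(-0.5 + 0.5 * xi + effect * ti))
        x.append(xi)
        t.append(ti)
        y.append(1 if rng.random() < prob else 0)
    return x, t, y
```

Calling `stratified_adjusted_test(*simulate(2000, effect=0.0))` returns a z-statistic and a two-sided p-value under a simulated null. Checking the validity claim itself would require repeating this over many simulated data sets and comparing the rejection rate with the nominal level, which this sketch does not do.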
