Robustness of ordinary least squares in randomized clinical trials

This journal has published an occasional series of papers on semiparametric methods for robust covariate control in the analysis of clinical trials. These methods are fairly easy to run on currently available computers, but standard software packages do not yet offer them as simple options. Moreover, they can be difficult to explain to practitioners who have only a basic statistical education. There is also a somewhat neglected literature demonstrating that ordinary least squares (OLS) is very robust to the types of outcome distribution features that have motivated the newer methods for robust covariate control. We review these two strands of literature and report new simulations that demonstrate the robustness of OLS to more extreme violations of normality than previously explored. The new simulations involve two strongly leptokurtic outcomes: near-zero binary outcomes and zero-inflated gamma outcomes. Potential examples of such outcomes include, respectively, 5-year survival rates for stage IV cancer and healthcare claim amounts for rare conditions. We find that traditional OLS methods work very well down to very small sample sizes for such outcomes. Under some circumstances, OLS with robust standard errors works well with even smaller sample sizes. Given this literature review and our new simulations, we think that most researchers may comfortably continue using standard OLS software, preferably with robust standard errors.
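As a rough illustration of the kind of simulation described above, the following sketch generates a zero-inflated gamma outcome in a balanced two-arm trial with no true treatment effect, fits OLS with one baseline covariate, and records how often the treatment coefficient is declared significant using classical and heteroscedasticity-consistent (HC3) standard errors. Everything specific here is an assumption made for illustration and not the paper's actual design: the function names, the 90% zero probability, the gamma shape and scale, the covariate effect, the choice of HC3, and the use of the normal critical value 1.96 in place of a t quantile.

```python
import numpy as np


def simulate_trial(n, rng, p_zero=0.9, shape=2.0, scale=1000.0):
    """Simulate one null trial with a zero-inflated gamma outcome.

    All parameter values are hypothetical, chosen only for illustration.
    """
    treat = rng.permutation(np.arange(n) % 2)   # balanced random assignment
    x = rng.normal(size=n)                      # baseline covariate
    nonzero = rng.random(n) > p_zero            # roughly 10% of outcomes are nonzero
    positive = rng.gamma(shape, scale * np.exp(0.3 * x), size=n)
    y = np.where(nonzero, positive, 0.0)        # treatment has no effect on y
    return y, treat, x


def ols_with_se(y, X):
    """Return OLS coefficients with classical and HC3 standard errors."""
    n, k = X.shape
    xtx_inv = np.linalg.inv(X.T @ X)
    beta = xtx_inv @ X.T @ y
    resid = y - X @ beta

    # Classical standard errors assume a common error variance.
    sigma2 = resid @ resid / (n - k)
    se_classical = np.sqrt(np.diag(sigma2 * xtx_inv))

    # HC3 sandwich estimator: squared residuals inflated by (1 - leverage)^2.
    leverage = np.einsum("ij,jk,ik->i", X, xtx_inv, X)
    weights = (resid / (1.0 - leverage)) ** 2
    meat = X.T @ (X * weights[:, None])
    se_hc3 = np.sqrt(np.diag(xtx_inv @ meat @ xtx_inv))
    return beta, se_classical, se_hc3


def rejection_rates(n=200, reps=2000, seed=0):
    """Estimate null rejection rates for (classical, HC3) tests of the treatment effect."""
    rng = np.random.default_rng(seed)
    rejects = np.zeros(2)
    for _ in range(reps):
        y, treat, x = simulate_trial(n, rng)
        X = np.column_stack([np.ones(n), treat, x])
        beta, se_c, se_r = ols_with_se(y, X)
        t_stats = np.abs(beta[1] / np.array([se_c[1], se_r[1]]))
        rejects += t_stats > 1.96               # normal approximation, nominal 5% level
    return rejects / reps


if __name__ == "__main__":
    classical, robust = rejection_rates()
    print(f"classical SE rejection rate: {classical:.3f}")
    print(f"HC3 SE rejection rate:       {robust:.3f}")
```

Rejection rates close to the nominal 0.05 under this null setup would be consistent with the robustness claim; varying `n` and `p_zero` in `rejection_rates` and `simulate_trial` gives a feel for how small the sample can get before either standard error breaks down.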
