Simple randomization did not protect against bias in smaller trials.

OBJECTIVES By removing systematic differences between treatment groups, simple randomization is assumed to protect against bias. However, random differences may remain if the sample size is not large enough. We sought to determine the minimal sample size required to eliminate random differences, thereby allowing an unbiased estimation of the treatment effect.

STUDY DESIGN AND SETTING We reanalyzed two published large, simple, multicenter trials: the International Stroke Trial (IST) and the Coronary Artery Bypass Grafting (CABG) Off- or On-Pump Revascularization Study (CORONARY). We repeated the analysis originally reported by the investigators 1,000 times in random samples of varying size. In each sample, we measured covariate balance across the treatment arms. We estimated the effect of aspirin and heparin on death or dependency at 30 days after stroke (IST), and the effect of off-pump CABG on a composite primary outcome of death, nonfatal stroke, nonfatal myocardial infarction, or new renal failure requiring dialysis at 30 days (CORONARY). In addition, we conducted a series of Monte Carlo simulations of randomized trials to supplement these analyses.

RESULTS Randomization removes random differences between treatment groups when a trial includes at least 1,000 participants, thereby resulting in minimal bias in effect estimation. Below this threshold, substantial bias is observed. In a short review, we show that such an enrollment is achieved in 41.5% of phase 3 trials published in the highest-impact medical journals.

CONCLUSIONS Conclusions drawn from completely randomized trials enrolling few participants may not be reliable. In these circumstances, alternatives such as minimization or blocking should be considered for allocating the treatment.
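The idea behind the resampling and Monte Carlo analyses can be illustrated with a minimal sketch. It is not the authors' code and does not use the IST or CORONARY data: it assumes a single hypothetical baseline covariate drawn from a standard normal distribution, allocates participants by independent fair coin flips (simple randomization), and reports the average absolute difference in covariate means between arms over 1,000 replicates. The function name `mean_abs_imbalance` and the sample sizes are illustrative assumptions.

```python
import random
import statistics


def mean_abs_imbalance(n, reps=1000, seed=0):
    """Average absolute between-arm difference in the mean of one
    baseline covariate under simple randomization (illustrative only)."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(reps):
        # Assumption: one standard-normal baseline covariate per participant.
        covariate = [rng.gauss(0.0, 1.0) for _ in range(n)]
        # Simple randomization: an independent fair coin flip per participant.
        in_treatment = [rng.random() < 0.5 for _ in range(n)]
        treated = [x for x, t in zip(covariate, in_treatment) if t]
        control = [x for x, t in zip(covariate, in_treatment) if not t]
        if not treated or not control:
            # Skip the rare replicate in which one arm is empty.
            continue
        diffs.append(abs(statistics.fmean(treated) - statistics.fmean(control)))
    return statistics.fmean(diffs)


# Hypothetical trial sizes, chosen only to show the trend.
results = {n: mean_abs_imbalance(n) for n in (20, 100, 1000)}
for n, imbalance in results.items():
    print(f"n={n:5d}  mean |difference in covariate means| = {imbalance:.3f}")
```

In this sketch the chance imbalance shrinks roughly in proportion to one over the square root of the sample size, which is the mechanism by which small, simply randomized trials can retain random differences between arms.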
