An introduction to multiplicity issues in clinical trials: the what, why, when and how.

Clinical trials often face a multiple testing problem that can affect both type I and type II error rates, leading to inappropriate interpretation of trial results. Multiplicity issues may need to be considered at the design, analysis and interpretation stages of a trial. The proportion of trial reports that do not adequately correct for multiple testing remains substantial. The purpose of this article is to provide an introduction to multiple testing issues in clinical trials, and to reduce confusion around the need for multiplicity adjustments. We use a tutorial, question-and-answer approach to address the key issues of why, when and how to consider multiplicity adjustments in trials. We summarize the circumstances under which multiplicity adjustments ought to be considered, as well as options for carrying out multiplicity adjustments in terms of trial design factors including Population, Intervention/Comparison, Outcome, Time frame and Analysis (PICOTA). Results are presented in an easy-to-use table and flow diagrams. Confusion about multiplicity issues can be reduced or avoided by considering the potential impact of multiplicity on type I and II errors and, if necessary, pre-specifying statistical approaches to either avoid or adjust for multiplicity in the trial protocol or analysis plan.
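
To make the type I error concern concrete, the following minimal Python sketch shows how the chance of at least one false positive grows with the number of tests, together with two standard adjustments (Bonferroni and Holm). It is an illustration only, not the article's recommended procedure: the function names are hypothetical, and the familywise error formula assumes the tests are independent, which is often not the case for correlated trial outcomes.

```python
def familywise_error_rate(alpha: float, m: int) -> float:
    """Probability of at least one false positive across m tests at level alpha.

    Assumption: the m tests are independent and every null hypothesis is true.
    """
    return 1.0 - (1.0 - alpha) ** m


def bonferroni_threshold(alpha: float, m: int) -> float:
    """Per-test significance threshold under the Bonferroni adjustment."""
    return alpha / m


def holm_reject(p_values, alpha=0.05):
    """Holm step-down procedure: returns one reject/retain flag per p-value.

    P-values are visited from smallest to largest; the k-th smallest
    (k = 0, 1, ...) is compared with alpha / (m - k). Testing stops at the
    first p-value that fails its threshold.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # all larger p-values are retained as well
    return reject


if __name__ == "__main__":
    # Five unadjusted tests at the 0.05 level (hypothetical scenario).
    print(round(familywise_error_rate(0.05, 5), 3))
    print(bonferroni_threshold(0.05, 5))
    print(holm_reject([0.01, 0.04, 0.03], alpha=0.05))
```

With five independent tests each run at the 0.05 level, the familywise error rate is about 0.23 rather than 0.05, which is the inflation that pre-specified adjustments are meant to control. Holm is never less powerful than Bonferroni while controlling the same error rate, which is one reason it is often preferred when a simple adjustment is needed.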
