Multivariable risk prediction can greatly enhance the statistical power of clinical trial subgroup analysis

Background: When subgroup analyses of a positive clinical trial are unrevealing, such findings are commonly used to argue that the treatment's benefits apply to the entire study population; however, such analyses are often limited by poor statistical power. Multivariable risk-stratified analysis has been proposed as an important advance in investigating heterogeneity in treatment benefits, yet no one has conducted a systematic statistical examination of the circumstances influencing the relative merits of this approach vs. conventional subgroup analysis.

Methods: Using simulated clinical trials in which the probability of outcomes in individual patients was stochastically determined by the presence of risk factors and the effects of treatment, we examined the relative merits of a conventional vs. a "risk-stratified" subgroup analysis under a variety of circumstances in which there is a small amount of uniformly distributed treatment-related harm. The statistical power to detect treatment-effect heterogeneity was calculated for risk-stratified and conventional subgroup analysis while varying: 1) the number, prevalence and odds ratios of the individual risk factors that determine risk in the absence of treatment, 2) the predictiveness of the multivariable risk model (including the accuracy of its weights), 3) the degree of treatment-related harm, and 4) the average untreated risk of the study population.

Results: Conventional subgroup analysis (in which single patient attributes are evaluated "one-at-a-time") had at best moderate statistical power (30% to 45%) to detect variation in a treatment's net relative risk reduction resulting from treatment-related harm, even under optimal circumstances (the overall statistical power of the study was good and treatment-effect heterogeneity was evaluated across a major risk factor [OR = 3]). In some instances a multivariable risk-stratified approach also had low to moderate statistical power (especially when the multivariable risk prediction tool had low discrimination). However, a multivariable risk-stratified approach can have excellent statistical power to detect heterogeneity in net treatment benefit under a wide variety of circumstances, including circumstances in which conventional subgroup analysis has poor statistical power.

Conclusion: These results suggest that under many likely scenarios, a multivariable risk-stratified approach will have substantially greater statistical power than conventional subgroup analysis for detecting heterogeneity in treatment benefits and safety related to previously unidentified treatment-related harm. Subgroup analyses must always be well-justified and interpreted with care, and conventional subgroup analyses can be useful under some circumstances; however, clinical trial reporting should include a multivariable risk-stratified analysis when an adequate externally-developed risk prediction tool is available.
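The simulation design described in the Methods can be illustrated with a minimal sketch. It is not the authors' code, and every specific in it is an assumption made for illustration: the function names, the default parameter values, the use of independent binary risk factors with a common odds ratio, the split of patients at the median risk score, and the Wald test on the difference in log odds ratios as the heterogeneity test. The risk score uses the true model weights, which corresponds to the best case for the risk-stratified approach.

```python
import numpy as np

def simulate_trial(rng, n=4000, n_factors=6, prevalence=0.3, factor_or=2.0,
                   base_risk=0.05, treat_or=0.7, harm=0.01):
    """Simulate one two-arm trial (all defaults are illustrative assumptions)."""
    x = rng.random((n, n_factors)) < prevalence   # independent binary risk factors
    treated = rng.random(n) < 0.5                 # 1:1 randomisation
    # Untreated log-odds: base_risk applies to a patient with no risk factors;
    # each factor present multiplies the odds by factor_or.
    score = np.log(base_risk / (1 - base_risk)) + np.log(factor_or) * x.sum(axis=1)
    # Treatment multiplies the odds of the outcome by treat_or ...
    p = 1 / (1 + np.exp(-(score + np.log(treat_or) * treated)))
    # ... and adds a small absolute harm that is the same for every treated patient.
    p = np.where(treated, p + harm * (1 - p), p)
    y = rng.random(n) < p                         # stochastic outcomes
    return x, treated, y, score

def log_or(y, treated):
    """Log odds ratio (treated vs. control) and its variance, adding 0.5 per cell."""
    a = np.sum(y & treated) + 0.5
    b = np.sum(~y & treated) + 0.5
    c = np.sum(y & ~treated) + 0.5
    d = np.sum(~y & ~treated) + 0.5
    return np.log(a * d / (b * c)), 1 / a + 1 / b + 1 / c + 1 / d

def interaction_z(y, treated, group):
    """Wald z statistic for the difference in log OR between group and the rest."""
    lo1, v1 = log_or(y[group], treated[group])
    lo0, v0 = log_or(y[~group], treated[~group])
    return (lo1 - lo0) / np.sqrt(v1 + v0)

def estimate_power(n_trials=500, seed=0, z_crit=1.96, **trial_kwargs):
    """Fraction of simulated trials in which each analysis detects heterogeneity."""
    rng = np.random.default_rng(seed)
    hits_conv = hits_risk = 0
    for _ in range(n_trials):
        x, treated, y, score = simulate_trial(rng, **trial_kwargs)
        conv_group = x[:, 0]                    # conventional: one risk factor
        risk_group = score > np.median(score)   # risk-stratified: above-median score
        hits_conv += abs(interaction_z(y, treated, conv_group)) > z_crit
        hits_risk += abs(interaction_z(y, treated, risk_group)) > z_crit
    return hits_conv / n_trials, hits_risk / n_trials
```

Calling `estimate_power()` returns the estimated power of the conventional and the risk-stratified analysis, in that order. Rerunning it with different values of `n_factors`, `prevalence`, `factor_or`, `harm`, or `base_risk` mimics, in a simplified way, the kind of variation the Methods describe; the numbers it produces depend on these assumptions and should not be read as reproducing the reported results.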
