Surrogate End Points in Clinical Trials: Are We Being Misled?

Clinical trials are the standard scientific method for evaluating a new biological agent, drug, device, or procedure for the prevention or treatment of disease in humans. The phase 3 trial is designed to evaluate a new agent's clinical benefit and possible side effects; as such, it is considered to be the definitive test of the agent's usefulness [1-3]. For phase 3 trials, the primary end point should be a clinical event relevant to the patient, that is, the event of which the patient is aware and wants to avoid. Examples are death, loss of vision, symptomatic events of the acquired immunodeficiency syndrome (AIDS), the need for ventilatory support, and other events causing a reduction in quality of life. Trials with these clinical outcomes often have a long duration and are expensive. As a consequence, there has recently been great interest in the development of alternative outcomes, or surrogate end points, to reduce the cost and shorten the duration of phase 3 trials [4-17].

As defined by Temple [13], a surrogate endpoint of a clinical trial is a laboratory measurement or a physical sign used as a substitute for a clinically meaningful endpoint that measures directly how a patient feels, functions or survives. Changes induced by a therapy on a surrogate endpoint are expected to reflect changes in a clinically meaningful endpoint. Examples of surrogate end points are increased CD4 cell counts or decreased viral load measures for trials of therapy for human immunodeficiency virus (HIV) infection or AIDS, suppression of ventricular arrhythmias or reduction in cholesterol level or blood pressure in cardiology trials, and tumor regression in trials of cancer therapy.

Surrogate end points are rarely, if ever, adequate substitutes for the definitive clinical outcome in phase 3 trials. We review the basic requirements that the surrogate must meet to be used as the replacement outcome.

Requirements for a Surrogate End Point

A correlate does not a surrogate make.

It is a common misconception that if an outcome is a correlate (that is, correlated with the true clinical outcome) it can be used as a valid surrogate end point (that is, a replacement for the true clinical outcome). However, proper justification for such replacement requires that the effect of the intervention on the surrogate end point predicts the effect on the clinical outcome, a much stronger condition than correlation. Prentice [11] developed criteria that are sufficient to validate surrogate end points in phase 3 trials. These criteria essentially require that the surrogate must be a correlate of the true clinical outcome and fully capture the net effect of treatment on the clinical outcome. Although the first criterion is usually easy to verify, the second is not. For example, several recent trials on HIV and AIDS [14-24] showed that the second criterion is not satisfied when CD4 cell count is used as a surrogate end point for development of symptomatic AIDS events or death.

Several factors, illustrated in Figure 1, may explain the failure of surrogate end points. Although it may be a correlate of disease progression (Figure 1A), a surrogate end point might not involve the same pathophysiologic process that results in the clinical outcome. Even when it does, some disease pathways are probably causally related to the clinical outcome and not related to the surrogate end point. Of the disease pathways affecting the true clinical outcome, the intervention may only affect the pathway mediated through the surrogate end point (Figure 1B) or the pathway or pathways independent of the surrogate end point (Figure 1C). Most important, the intervention might also affect the true clinical outcome by unintended mechanisms of action that are independent of the disease process (Figure 1D). The effects of the intervention mediated through intended mechanisms could be substantially offset by unintended, unanticipated, or unrecognized mechanisms [25].

Figure 1. Reasons for failure of surrogate end points.

Figure 2 illustrates the setting that provides the greatest potential for the validity of the surrogate end point. Specifically, the surrogate is in the only causal pathway of the disease process, and the intervention's entire effect on the true clinical outcome is mediated through its effect on the surrogate. Even in this ideal setting, however, surrogate end points can yield misleading conclusions. The intervention's effect on the true clinical end point could be underestimated if there is considerable noise in the measurement of effects on the surrogate end point. The effect on the true end point could be overestimated if the effect on the surrogate, although statistically significant, is not of sufficient size or duration to meaningfully alter the true clinical outcome. This overestimation could readily arise, for example, in the ongoing evaluation of protease inhibitors in HIV-infected patients, in which effects on the surrogate end point (viral RNA levels in the peripheral blood) are substantial but of only short duration.

Figure 2. The setting that provides the greatest potential for the surrogate end point to be valid.

A review of recent experiences with surrogates is sobering, revealing many cases for which biological markers were correlates of clinical outcomes but failed to predict the effect of treatment on the clinical outcome. In the next section, we examine the failure of surrogates in several clinical trial settings by disease area. We can only speculate about the reasons for these failures because, even in retrospect, our understanding of the causal pathways of the disease process and the mechanisms of action of the intervention is incomplete. Table 1 provides such speculation, according to the possible explanations provided in Figure 1.

Table 1. Speculation on Reasons for Failures of Surrogate End Points*

Surrogate End Points in Cardiology

Arrhythmia Suppression

Use of reduction in ventricular ectopic contractions as a surrogate for decreased cardiovascular-related mortality provides a classic example of the unreliability of surrogate end points. Ventricular arrhythmia is associated with an almost fourfold increase in the risk for death related to cardiac complications [26, 27], particularly sudden death. It was hypothesized that suppression of ventricular arrhythmias after myocardial infarction would reduce the rate of death. Three new drugs (encainide, flecainide, and moricizine) were found to suppress arrhythmias effectively and were approved by the Food and Drug Administration (FDA) for use in patients with life-threatening or severely symptomatic arrhythmias. Although follow-up trials had not been done to determine whether the reduction in arrhythmias would lead to a reduction in death rates, more than 200 000 persons per year eventually took these drugs in the United States.

The Cardiac Arrhythmia Suppression Trial (CAST) [26-28] evaluated how the three drugs would affect survival of patients who had had myocardial infarction and had at least 10 premature ventricular beats per hour. The early results from CAST were startling. The encainide and flecainide arms of the trial were terminated early when 33 sudden deaths occurred in patients taking either drug compared with only 9 in the matching placebo group. A total of 56 patients in the encainide and flecainide group died, and 22 patients in the placebo group died. After the data were finalized, the sudden death comparison was 43 versus 16, and the total number of deaths was 63 in the encainide and flecainide group versus 26 in the placebo group. Later results from CAST also established an increased risk for death in patients receiving moricizine [28].

Two other examples are relevant to the arrhythmia setting.
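
Before turning to those two examples, it may help to see how lopsided the CAST counts quoted above are. The short Python sketch below simply divides the number of deaths in the encainide and flecainide group by the number in the placebo group. The arm sizes are not given in this text, so these are ratios of counts, not risk ratios; they approximate risk ratios only under the assumption, made here purely for illustration, that the two groups were of similar size. The names `cast_counts` and `count_ratio` are hypothetical and are not part of any CAST analysis.

```python
# Death counts quoted in the text for CAST:
# (encainide/flecainide group, placebo group).
# Arm sizes are NOT given in the text, so only ratios of counts are computed.
cast_counts = {
    "sudden deaths, early termination": (33, 9),
    "all deaths, early termination": (56, 22),
    "sudden deaths, finalized data": (43, 16),
    "all deaths, finalized data": (63, 26),
}


def count_ratio(active: int, placebo: int) -> float:
    """Ratio of event counts (active / placebo).

    This approximates a risk ratio only if both arms are of similar size,
    which is an illustrative assumption, not a fact stated in the text.
    """
    if placebo <= 0:
        raise ValueError("placebo count must be positive")
    return active / placebo


ratios = {label: count_ratio(a, p) for label, (a, p) in cast_counts.items()}

for label, ratio in ratios.items():
    print(f"{label}: {ratio:.2f} deaths on active drug per placebo death")
```

Under that equal-size assumption, every comparison shows more than twice as many deaths on the drugs that successfully suppressed the surrogate (for example, 63/26 is roughly 2.4), which is the opposite of what the arrhythmia-suppression hypothesis predicted.
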
Quinidine has been used to maintain sinus rhythm after patients with atrial fibrillation have been converted [29]. A meta-analysis of six trials indicated that quinidine maintained sinus rhythm at 1 year (50% of patients who received quinidine compared with 25% of those who did not) but increased the mortality rate from 0.8% to 2.9%. Preventing recurrence of atrial fibrillation is an important benefit, but it does not outweigh the increased mortality rate. Similar inconsistencies were found for lidocaine; a meta-analysis showed that a one-third reduction in the risk for ventricular tachycardia was accompanied by a one-third increase in death rate [30, 31].

Exercise Tolerance in Congestive Heart Failure

Patients with congestive heart failure have decreased cardiac output, characteristic symptoms of dyspnea and orthopnea, decreased exercise capacity, and a high risk for death. The annual mortality rate for patients with severe congestive heart failure is 20% to 40%. The poor exercise performance is presumed to be a result of decreased cardiac output, but it could also result from increased pulmonary vascular pressure. In this disease, cardiac output and ejection fraction have been used as surrogate end points for examining the usefulness of new drugs, and exercise tolerance and symptomatic improvement have also been regularly assessed as intermediate end points. Although some treatments that affect these end points produce improved survival [32-35], others provide no benefit or actually decrease survival. Diuretics and digoxin help alleviate symptoms. No data on the survival effects of these treatments have yet been published, although results of the recently reported Digitalis Investigation Group trial [36] show no survival benefit (American College of Cardiology, March 1996. Unpublished data).

One of the earlier drugs that was proposed as a treatment for congestive heart failure was milrinone.
Completed studies indicated that milrinone improved cardiac output and increased exercise tolerance. This drug is an inotropic agent (as is digoxin) that stimulates the force of contraction of the heart. Because the FDA was concerned that such agents may have adverse long-term effects (as was the case for -agonis

[1]  V De Gruttola,et al.  Estimating the proportion of treatment effect explained by a surrogate marker. , 1997, Statistics in medicine.

[2]  T. Habermann,et al.  Granulocyte colony-stimulating factor in severe chemotherapy-induced afebrile neutropenia. , 1997, The New England journal of medicine.

[3]  B. Leroux,et al.  Evaluating the Validity of Probing Attachment Loss as a Surrogate for Tooth Mortality in a Clinical Trial on the Elderly , 1997, Journal of dental research.

[4]  T. Fleming,et al.  Perspective: validating surrogate markers--are we being naive? , 1997, The Journal of infectious diseases.

[5]  A. J. Man in 't Veld Surrogate end points in clinical trials. , 1997, Blood pressure. Supplement.

[6]  D. M. Barr,et al.  Rationale, design, implementation, and baseline characteristics of patients in the DIG trial: a large, simple, long-term trial to evaluate the effect of digitalis on mortality in heart failure. , 1996, Controlled clinical trials.

[7]  T. Raghunathan,et al.  The risk of myocardial infarction associated with antihypertensive drug therapies. , 1995, JAMA.

[8]  T. Derouen,et al.  A survey of endpoint characteristics in periodontal clinical trials published 1988-1992, and implications for future studies. , 1995, Journal of clinical periodontology.

[9]  N. Day,et al.  Surrogate markers in AIDS and cancer trials. , 1995, Statistics in medicine.

[10]  L A Moyé,et al.  The cardiac arrhythmia suppression trial. Casting suppression in a different light. , 1995, Circulation.

[11]  Scandinavian Simvastatin Survival Study Group Randomised trial of cholesterol lowering in 4444 patients with coronary heart disease: the Scandinavian Simvastatin Survival Study (4S) , 1994, The Lancet.

[12]  Thomas R. Fleming,et al.  Auxiliary outcome data and the mean score method , 1994 .

[13]  J. Phair,et al.  The duration of zidovudine benefit in persons with asymptomatic HIV infection. Prolonged evaluation of protocol 019 of the AIDS Clinical Trials Group. , 1994, JAMA.

[14]  M S Pepe,et al.  Surrogate and auxiliary endpoints in clinical trials, with potential applications in cancer and AIDS research. , 1994, Statistics in medicine.

[15]  C. Furberg,et al.  Overtreatment and undertreatment of hypertension , 1994, Journal of internal medicine.

[16]  M. Law,et al.  Assessing possible hazards of reducing serum cholesterol , 1994, BMJ.

[17]  N J Wald,et al.  By how much and how quickly does reduction in serum cholesterol concentration lower risk of ischaemic heart disease? , 1994, BMJ.

[18]  T R Fleming,et al.  Surrogate markers in AIDS and cancer trials. , 1994, Statistics in medicine.

[19]  M. Kosorok,et al.  Using surrogate failure time data to increase cost effectiveness in clinical trials , 1993 .

[20]  K. Holmes,et al.  Antiretroviral therapy for adult HIV-infected patients. Recommendations from a state-of-the-art conference. National Institute of Allergy and Infectious Diseases State-of-the-Art Panel on Anti-Retroviral Therapy for Adult HIV-Infected Patients. , 1993, JAMA.

[21]  R. Bain,et al.  Effects of vesnarinone on morbidity and mortality in patients with heart failure , 1993 .

[22]  R. Massof,et al.  Supplemental vitamin A retards loss of ERG amplitude in retinitis pigmentosa. , 1993, Archives of ophthalmology.

[23]  D. Lin,et al.  Evaluating the role of CD4-lymphocyte counts as surrogate endpoints in human immunodeficiency virus clinical trials. , 1993, Statistics in medicine.

[24]  Robert Schooley,et al.  CD4+ Lymphocytes Are an Incomplete Surrogate Marker for Clinical Progression in Persons with Asymptomatic HIV Infection Taking Zidovudine , 1993, Annals of Internal Medicine.

[25]  J. Aboulker,et al.  Preliminary analysis of the Concorde trial , 1993, The Lancet.

[26]  J. Aboulker,et al.  Preliminary analysis of the Concorde trial. Concorde Coordinating Committee. , 1993, Lancet.

[27]  Frans Van de Werf,et al.  An international randomized trial comparing four thrombolytic strategies for acute myocardial infarction. , 1993, The New England journal of medicine.

[28]  B. W. Nicholson,et al.  A randomized trial of vitamin A and vitamin E supplementation for retinitis pigmentosa. , 1993, Archives of ophthalmology.

[29]  I. Holme Relation of coronary heart disease incidence and total mortality to plasma cholesterol reduction in randomised trials: use of meta-analysis. , 1993, British heart journal.

[30]  T. Fleming [Evaluating Therapeutic Interventions: Some Issues and Experiences]: Rejoinder , 1992 .

[31]  Thomas R. Fleming,et al.  Evaluating Therapeutic Interventions: Some Issues and Experiences , 1992 .

[32]  E. J. Brown,et al.  Effect of captopril on mortality and morbidity in patients with left ventricular dysfunction after myocardial infarction. Results of the survival and ventricular enlargement trial. The SAVE Investigators. , 1992, The New England journal of medicine.

[33]  S W Lagakos,et al.  Surrogate Markers in AIDS: Where Are We? Where Are We Going? , 1992, Annals of Internal Medicine.

[34]  P. Piedbois Modulation of fluorouracil by leucovorin in patients with advanced colorectal cancer: evidence in terms of response rate. Advanced Colorectal Cancer Meta-Analysis Project. , 1992, Journal of clinical oncology : official journal of the American Society of Clinical Oncology.

[35]  B. Graubard,et al.  Statistical validation of intermediate endpoints for chronic diseases. , 1992, Statistics in medicine.

[36]  Robert Lemery,et al.  Effect of the antiarrhythmic agent moricizine on survival after myocardial infarction. , 1992, The New England journal of medicine.

[37]  D. DeMets,et al.  Effect of oral milrinone on mortality in severe chronic heart failure. The PROMISE Study Research Group. , 1991, The New England journal of medicine.

[38]  G. Tucker,et al.  Clinical Measurement in Drug Evaluation , 1991 .

[39]  H L Greene,et al.  Mortality and morbidity in patients receiving encainide, flecainide, or placebo. The Cardiac Arrhythmia Suppression Trial. , 1991, The New England journal of medicine.

[40]  P Bacchetti,et al.  Surrogate markers for survival in patients with AIDS and AIDS related complex treated with zidovudine. , 1991, BMJ.

[41]  John I. Gallin,et al.  A controlled trial of interferon gamma to prevent infection in chronic granulomatous disease. The International Chronic Granulomatous Disease Cooperative Study Group. , 1991, The New England journal of medicine.

[42]  Prevention of stroke by antihypertensive drug treatment in older persons with isolated systolic hypertension. Final results of the Systolic Hypertension in the Elderly Program (SHEP). SHEP Cooperative Research Group. , 1991, JAMA.

[43]  B. Rifkind,et al.  The value of lowering cholesterol after myocardial infarction. , 1990, The New England journal of medicine.

[44]  E. Antman,et al.  Efficacy and safety of quinidine therapy for maintenance of sinus rhythm after cardioversion. A meta-analysis of randomized control trials. , 1990, Circulation.

[45]  K. Matthews,et al.  Lowering cholesterol concentrations and mortality: a quantitative review of primary prevention trials. , 1990, BMJ.

[46]  W M O'Fallon,et al.  Effect of fluoride treatment on the fracture rate in postmenopausal women with osteoporosis. , 1990, The New England journal of medicine.

[47]  Gruppo Italiano per lo Studio della Sopravvivenza nell'Infarto Miocardico GISSI-2: A factorial randomised trial of alteplase versus streptokinase and heparin versus no heparin among 12 490 patients with acute myocardial infarction , 1990, The Lancet.

[48]  R. Collins,et al.  Blood pressure, stroke, and coronary heart disease Part 2, short-term reductions in blood pressure: overview of randomised drug trials in their epidemiological context , 1990, The Lancet.

[49]  S W Lagakos,et al.  Zidovudine in Asymptomatic Human Immunodeficiency Virus Infection , 1990 .

[50]  M. Weir,et al.  The Cardiac Arrhythmia Suppression Trial Investigators: Preliminary Report: Effect of Encainide and Flecainide on Mortality in a Randomized Trial of Arrhythmia Suppression After Myocardial Infarction. , 1990 .

[51]  T. Fleming Evaluation of active control trials in AIDS. , 1990, Journal of acquired immune deficiency syndromes.

[52]  M. Gail,et al.  On the use of laboratory markers as surrogates for clinical endpoints in the evaluation of treatment for HIV infection. , 1990, Journal of acquired immune deficiency syndromes.

[53]  N. Laird,et al.  Meta-analytic evidence against prophylactic use of lidocaine in acute myocardial infarction. , 1989, Archives of internal medicine.

[54]  C. Furberg,et al.  Calcium channel blockers in acute myocardial infarction and unstable angina: an overview. , 1989, BMJ.

[55]  J. Herson The use of surrogate endpoints in clinical trials (an introduction to a series of four papers) , 1989 .

[56]  R. Prentice Surrogate endpoints in clinical trials: definition and operational criteria. , 1989, Statistics in medicine.

[57]  J. Wittes,et al.  Surrogate endpoints in clinical trials: cardiovascular diseases. , 1989, Statistics in medicine.

[58]  S. Ellenberg,et al.  Surrogate endpoints in clinical trials: cancer. , 1989, Statistics in medicine.

[59]  D. Seigel,et al.  Surrogate endpoints in clinical trials: ophthalmologic disorders. , 1989, Statistics in medicine.

[60]  W. Rogers,et al.  Preliminary report: effect of encainide and flecainide on mortality in a randomized trial of arrhythmia suppression after myocardial infarction. , 1989, The New England journal of medicine.

[61]  R. Collins,et al.  Effects of prophylactic lidocaine in suspected acute myocardial infarction. An overview of results from the randomized, controlled trials. , 1988, JAMA.

[62]  G. Comstock Identification of an effective vaccine against tuberculosis. , 1988, The American review of respiratory disease.

[63]  K. Swedberg,et al.  Effects of enalapril on mortality in severe congestive heart failure: results of the Cooperative North Scandinavian Enalapril Survival Study (CONSENSUS). , 1988, The American journal of cardiology.

[64]  W F Taylor,et al.  Lung cancer screening: the Mayo program. , 1986, Journal of occupational medicine. : official publication of the Industrial Medical Association.

[65]  Gruppo Italiano per lo Studio della Sopravvivenza nell'Infarto Miocardico,et al.  Effectiveness of intravenous thrombolytic treatment in acute myocardial infarction , 1986, The Lancet.

[66]  W. Ganz,et al.  The thrombolysis in myocardial infarction (TIMI) trial. , 1985, The New England journal of medicine.

[67]  R. Temple,et al.  Food and Drug Administration requirements for approval of new anticancer drugs. , 1985, Cancer treatment reports.

[68]  H. S. Mueller,et al.  The Thrombolysis in Myocardial Infarction (TIMI) trial. Phase I findings. , 1985, The New England journal of medicine.

[69]  W F Taylor,et al.  Early lung cancer detection: results of the initial (prevalence) radiologic and cytologic screening in the Mayo Clinic study. , 1984, The American review of respiratory disease.

[70]  C. Moertel Improving the efficiency of clinical trials: a medical perspective. , 1984, Statistics in medicine.

[71]  R. Shanks,et al.  Prophylactic lidocaine in suspected acute myocardial infarction. , 1984, International journal of cardiology.

[72]  Lippincott Williams Wilkins,et al.  Coronary artery surgery study (CASS): a randomized trial of coronary artery bypass surgery. Survival data. , 1983, Circulation.

[73]  W. O'Fallon,et al.  Effect of the fluoride/calcium regimen on vertebral fracture occurrence in postmenopausal osteoporosis. Comparison with conventional therapy. , 1982, The New England journal of medicine.

[74]  Continuous or nocturnal oxygen therapy in hypoxemic chronic obstructive lung disease: a clinical trial. Nocturnal Oxygen Therapy Trial Group. , 1980, Annals of internal medicine.

[75]  J. Stamler Clofibrate and niacin in coronary heart disease. , 1975, JAMA.