Clinical practice guidelines facilitate implementation of high-value health care when they are based on consideration of the benefits and harms most relevant to practitioners, their patients, and society. The Institute of Medicine has specified that guidelines should be based on systematic reviews that consider the quality, quantity, and consistency of the relevant evidence (1). This approach is supported internationally (2).

In practice, findings from systematic reviews may not apply directly to the guideline development setting. For example, cancer screening guideline panels may need to determine not only whether they should recommend screening for a specific condition but, if so, the ages at which to start and stop, the frequency, and the test method. For many screening strategies, data to directly address these questions are lacking.

One way to bridge the gap between primary evidence and guideline development is by using models. Models are mathematical frameworks that integrate available data to estimate the health consequences of alternative intervention strategies in patient populations. There are different classes of models with different goals, methodological approaches, or both (3–10). Models have been used to examine the natural history of disease, explain disease occurrence trends, and interrogate harm-benefit tradeoffs of competing policies. Some models express the entire disease process and outcomes at a population level; others (microsimulation models) attempt to construct a virtual population in which persons progress through a disease process. We focus on modeling to estimate the harm-benefit tradeoffs of different disease management strategies. This requires the existence of a calibrated model of disease progression, that is, a representation of disease progression that is shown to be consistent with observed data.
For example, in cancer screening we need a model of disease without screening that yields projections of disease incidence similar to those observed in the absence and presence of screening.

Although the use of models is increasing in guideline development, many guidelines are created without them. In some cases, they are not needed because guideline questions can be adequately addressed by using published primary evidence. However, in many cases, an understanding of how modeling can provide useful information is lacking.

This article proposes that models play an important role in integrating and extending the evidence on outcomes of health care interventions. We provide recommendations of when models are likely to be valuable, based on gaps between published research studies and guideline questions. We also discuss aspects of model quality. Finally, we provide direction for how a modeling study should be designed and integrated into the guideline development process.

Examples of Models

We use 2 examples from cancer screening to illustrate our primary points. Screening trials provide primary evidence on benefit but are unable to compare the full range of screening strategies; represent a screening program as it would be broadly implemented; estimate benefits over a lifetime horizon; or fully assess benefits, harms, and costs. To overcome these limitations, models have been used to provide this critical information to support the development of cancer screening recommendations (11).

Colorectal Cancer Screening

Colorectal cancer is one cancer type for which there is broad consensus on screening efficacy. Trials of fecal occult blood tests (FOBTs) have shown significant decreases in colorectal cancer deaths (12). However, disease management and testing technologies have changed since the trials began nearly 40 years ago.
For example, new FOBT variants, including Hemoccult SENSA (Beckman Coulter) and immunochemical tests, are available, and use of colonoscopy for screening has increased. No randomized studies of these newer approaches have been conducted, although estimates of their performance have been published (12).

A study used 2 models to calculate the number of life-years gained (measure of benefits) and the number of diagnostic colonoscopies (measure of harms and resource use) and to compare different screening ages and intervals for available screening tests (13). The models superimposed candidate screening tests with established performance characteristics on representations of adenoma onset, progression to colorectal cancer, and cancer progression. The models reproduced disease incidence trends in published screening trials (13). The results provided evidence for starting screening at age 50 years rather than 40 or 60 years and for stopping at age 75 years rather than 85 years. They supported a 10-year screening interval for colonoscopy and a 1-year interval for high-sensitivity FOBTs. In the models, the Hemoccult II FOBT (Beckman Coulter) had an inferior harm-benefit ratio compared with more recent FOBTs (13). Screening strategies recommended by the U.S. Preventive Services Task Force (USPSTF) were informed by the model results (14).

Mammography Screening

Many mammography screening trials have been conducted worldwide (15). Most indicated that screening reduces breast cancer mortality, but the trials enrolled women of different ages, used varying screening intervals, and had limited numbers of screening rounds. They also used film rather than the newer digital mammography technology and predated contemporary cancer therapies, such as tamoxifen and trastuzumab (15). Therefore, the trials were of limited value for contemporary settings.
To inform guideline development (16), a modeling study was used to compare benefits and harms of mammography screening with different starting and stopping ages and screening intervals (17). The models combined previously estimated disease natural history with sensitivity estimates of current mammography tests. Many screening approaches were evaluated. The models indicated that a strategy of biennial mammography for women aged 50 to 69 years maintained an average of 81% of the benefit of annual mammography with half the number of false-positive results. For younger starting ages, the models indicated that initiating biennial screening at age 40 (vs. 50) years reduced mortality by an additional 3% but consumed more resources and yielded more false-positive results. The USPSTF used the model results to inform its recommendations for biennial screening between ages 50 and 74 years, with individualized decision making before age 50 years and after age 74 years (16).

Which Gaps Between Primary Evidence and Guidelines Can Be Addressed by Using Models?

In both examples, a clear gap existed between the primary evidence from randomized, controlled trials (RCTs) and the evidence needed to develop clinical guidelines, and modeling bridged the gap. The examples help outline 4 areas in which models can be useful (Table 1).

Table 1. Four Areas Where Models Can Bridge the Gap Between Primary Evidence and Guideline Development

Apply New or Updated Information on Disease Risk, Tests, and Treatments

Estimates of mortality benefit from breast cancer screening are derived from trials that started decades ago. However, advances in treatment have decreased disease-specific mortality over this same period, thus potentially reducing benefits of early disease detection. Models can project outcomes by using more recent mortality rates, derived from information about the effect of new treatments on mortality and from contemporary life tables (13, 17). In addition, screening tests may change over time.
Digital mammography has largely replaced film, and new immunochemical FOBTs have been developed since the initial trials. Models calibrated to disease incidence under older screening technologies can incorporate these newer data sources to provide outcome estimates that are more relevant to current practice.

Explore a Wide Range of Possible Intervention Strategies

Many potential screening strategies can be considered that are defined by ages at which to start and stop, intervals, methods, and referral criteria. However, trials can test only a few screening strategies and have limited follow-up. For example, the 7 mammography trials in the systematic review used by the USPSTF had a median of 4 to 5 screening rounds (15), whereas population screening programs often use longer periods. To inform the USPSTF guidelines, the mammography models compared 20 strategies (17). For colorectal cancer, with its many possible screening methods, the models compared 145 strategies (13).

Assess Important Benefits, Harms, and Costs Over the Lifetime of the Population

Models can extend beyond published studies to evaluate new measures of harm-benefit tradeoffs and extrapolate effects on health outcomes beyond the study time horizons. Costs can also be incorporated. Although trials of FOBTs had follow-up exceeding 10 years, they underestimated the lifetime incidence and mortality effects of screening because adenoma detection and removal by colonoscopy after an abnormal FOBT result may prevent colon cancer decades later. Outcomes projected by the colorectal and breast cancer screening models reflected complete screening schedules over lifetime horizons. They included not only the numbers of screening tests, unnecessary biopsies, cancer cases, and cancer deaths prevented but also outcomes difficult to directly measure in trials, including the number of life-years gained and overdiagnosis.
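
As a rough illustration of how a strategy space defined by start age, stop age, and interval can be enumerated and summarized over a lifetime horizon, the following Python sketch builds a small grid of schedules and tallies the expected number of screens per person. It is a minimal sketch under stated assumptions, not the method used in the cited modeling studies: the names Strategy, enumerate_strategies, toy_survival, and expected_screens are hypothetical, the grid values are illustrative, and the survival curve is a placeholder rather than a real life table.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Strategy:
    """A screening schedule defined by start age, stop age, and interval (years)."""
    start_age: int
    stop_age: int
    interval: int

    def screening_ages(self):
        # Ages at which a screen is scheduled; the stop age is included
        # only when it falls on the schedule.
        return list(range(self.start_age, self.stop_age + 1, self.interval))


def enumerate_strategies(start_ages, stop_ages, intervals):
    """Build every combination of start age, stop age, and interval."""
    return [
        Strategy(start, stop, interval)
        for start, stop, interval in product(start_ages, stop_ages, intervals)
        if start < stop
    ]


def toy_survival(age):
    # HYPOTHETICAL placeholder: probability of being alive at `age`,
    # falling linearly from 1.0 at age 40 to 0.0 at age 100.
    # A real analysis would use contemporary life tables instead.
    return min(1.0, max(0.0, (100 - age) / 60))


def expected_screens(strategy, survival=toy_survival):
    """Expected lifetime screens per person: each scheduled screen is
    weighted by the probability of being alive at that age."""
    return sum(survival(age) for age in strategy.screening_ages())


# Illustrative grid only; not the 20 or 145 strategies of the cited studies.
strategies = enumerate_strategies(
    start_ages=(40, 50, 60), stop_ages=(75, 85), intervals=(1, 2)
)

for s in strategies:
    print(s, round(expected_screens(s), 2))
```

A full model would replace expected_screens with projected outcomes such as life-years gained, false-positive results, or diagnostic colonoscopies, so that each strategy in the grid can be compared on both harms and benefits.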
Project Outcomes for the Conditions for Which the Guideline Is Intended

Participants in trials are often highly selected. Specific populations of interest not enrolled or underrepresented in trials may include racial or ethnic minorities, persons younger or older than most trial participants, and those with comorbid conditions. Also, health care conditions and outcomes in settings or countries where trials have been done may differ from those where guidelines apply. For example, prostate cancer incidence rates are higher in the Unit
[1] Anya Okhmatovskaia, et al. Validation of population-based disease simulation models: a review of concepts and methods. BMC Public Health. 2010.
[2] G. Fonarow, et al. ACC/AHA statement on cost/value methodology in clinical practice guidelines and performance measures: a report of the American College of Cardiology/American Heart Association Task Force on Performance Measures and Task Force on Practice Guidelines. Circulation. 2014.
[3] J. Caro, et al. Model transparency and validation: a report of the ISPOR-SMDM Modeling Good Research Practices Task Force-7. Value in Health. 2012.
[4] Neff Walker, et al. Mathematical models in the evaluation of health programmes. The Lancet. 2011.
[5] M. J. Buxton, et al. Modelling in economic evaluation: an unavoidable fact of life. Health Economics. 1997.
[6] L. Shaw, et al. ACC/AHA statement on cost/value methodology in clinical practice guidelines and performance measures: a report of the American College of Cardiology/American Heart Association Task Force on Performance Measures and Task Force on Practice Guidelines. Journal of the American College of Cardiology. 2014.
[7] A. S. Whittemore, et al. Prostate cancer incidence and mortality in the United States and the United Kingdom. Journal of the National Cancer Institute. 1998.
[8] Cancer, et al. Once-only flexible sigmoidoscopy screening in prevention of colorectal cancer: a multicentre randomised controlled trial. The Lancet. 2010.
[9] H. D. de Koning, et al. Prostate-Specific Antigen Screening in the United States vs in the European Randomized Study of Screening for Prostate Cancer–Rotterdam. Journal of the National Cancer Institute. 2010.
[10] Ewout W. Steyerberg, et al. How much colonoscopy screening should be recommended to individuals with various degrees of family history of colorectal cancer? Cancer. 2011.
[11] Uwe Siebert, et al. Modeling good research practices, overview: a report of the ISPOR-SMDM Modeling Good Research Practices Task Force-1. Value in Health. 2012.
[12] D. Berry, et al. Effect of screening and adjuvant therapy on mortality from breast cancer. The New England Journal of Medicine. 2006.
[13] Amy B. Knudsen, et al. Evaluating Test Strategies for Colorectal Cancer Screening: A Decision Analysis for the U.S. Preventive Services Task Force. Annals of Internal Medicine. 2008.
[14] Mirjam Kretzschmar, et al. Dynamic Transmission Modeling: A Report of the ISPOR-SMDM Modeling Good Research Practices Task Force-5. Value in Health. 2012.
[15] J. Habbema, et al. Using Mathematical Models to Inform Public Policy for Cancer Prevention and Screening. 2013.
[16] Rongwei Fu, et al. Screening for Colorectal Cancer: A Targeted, Updated Systematic Review for the U.S. Preventive Services Task Force. Annals of Internal Medicine. 2008.
[17] Jonathan Karnon, et al. Model Parameter Estimation and Uncertainty Analysis. Medical Decision Making. 2012.
[18] M. Mcgrath. Cost Effectiveness in Health and Medicine. 1998.
[19] Timothy J. Wilt, et al. Screening for breast cancer: U.S. Preventive Services Task Force recommendation statement. Annals of Internal Medicine. 2009.
[20] Marvin Zelen, et al. Effects of Mammography Screening Under Different Screening Schedules: Model Estimates of Potential Benefits and Harms. 2009.
[21] J. Habbema, et al. A novel hypothesis on the sensitivity of the fecal occult blood test. Cancer. 2009.
[22] D. Berry, et al. Breast Cancer Working Group of the Cancer Intervention and Surveillance Modeling Network. Effects of mammography screening under different screening schedules: Model estimates of potential benefits and harms (Annals of Internal Medicine (2009) 151, (738-747)). 2010.
[23] H. Nelson, et al. Screening for Breast Cancer: An Update for the U.S. Preventive Services Task Force. Annals of Internal Medicine. 2009.
[24] S. Kamen, et al. The task force. Journal of Hospital Dental Practice. 1976.
[25] Milton C. Weinstein, et al. Principles of good practice for decision analytic modeling in health-care evaluation: report of the ISPOR Task Force on Good Research Practices, Modeling Studies. Value in Health. 2003.
[26] Jonathan Karnon, et al. Modeling using discrete event simulation: a report of the ISPOR-SMDM Modeling Good Research Practices Task Force-4. Value in Health. 2012.
[27] D. Owens, et al. State-transition modeling: a report of the ISPOR-SMDM Modeling Good Research Practices Task Force-3. Value in Health. 2012.
[28] Murray Krahn, et al. Conceptualizing a Model. Value in Health. 2012.
[29] Bernadette Mazurek Melnyk, et al. Screening for colorectal cancer: U.S. Preventive Services Task Force recommendation statement. Annals of Internal Medicine. 2008.
[30] V. Moyer, et al. Screening for Cervical Cancer: U.S. Preventive Services Task Force Recommendation Statement. Annals of Internal Medicine. 2012.
[31] Eric J. Feuer, et al. Modeling cancer natural history, epidemiology, and control: reflections on the CISNET breast group experience. Journal of the National Cancer Institute. Monographs. 2006.
[32] Ann G. Zauber, et al. Individualizing colonoscopy screening by sex and race. Gastrointestinal Endoscopy. 2009.
[33] A. Miller. New data on prostate-cancer mortality after PSA screening. The New England Journal of Medicine. 2012.
[34] D. Habbema, et al. Cervical cancer screening in the United States and the Netherlands: a tale of two countries. The Milbank Quarterly. 2012.
[35] L. Bisanti, et al. Once-only sigmoidoscopy in colorectal cancer screening: follow-up findings of the Italian Randomized Controlled Trial, SCORE. Journal of the National Cancer Institute. 2011.
[36] S. Pearson, et al. Cost consideration in the clinical guidance documents of physician specialty societies in the United States. JAMA Internal Medicine. 2013.
[37] Murray Krahn, et al. Conceptualizing a model: a report of the ISPOR-SMDM Modeling Good Research Practices Task Force-2. Value in Health. 2012.
[38] T. Wilt, et al. Screening for Breast Cancer: U.S. Preventive Services Task Force Recommendation Statement. 2011.