Key Summary Points

- Computerized clinical decision support that is integrated into the physician order entry system of an electronic health record can help improve the appropriate ordering of diagnostic imaging studies.
- Interventions that include a hard stop to prevent clinicians from ordering imaging tests classified as inappropriate, and interventions in an integrated care delivery setting, may improve effectiveness.
- The potential harms of computerized clinical decision-support interventions have rarely been studied.

Concern that the costs of health care are increasing at unsustainable rates is widespread. One driver of cost is the increasing use of radiologic imaging procedures, particularly advanced imaging techniques, such as computed tomography (CT) and magnetic resonance imaging (MRI). For example, the use of CT scans in the emergency department (ED) increased by 330% from 1996 to 2007, a period during which the rate of ED visits increased by only 11% (1). Other investigators reported a 3-fold increase in the likelihood of having CT or MRI during an ED visit between 1998 and 2007 (2).

The increases in imaging studies have led to closer scrutiny of their clinical value. In some cases, strong evidence shows that they provide no value or may even harm patients. For example, a meta-analysis of early lumbar imaging for patients with acute low back pain included 5 randomized, controlled trials in which patients were randomly assigned to receive or not receive early imaging in the form of a plain film, CT, or MRI; at 3 months, patients who had received imaging had no improvement in pain or function (3). In other situations, strong professional opinion considers certain tests to have little value, mainly because alternative tests are preferred or the probability of an abnormal image is exceedingly remote. When the American Board of Internal Medicine Foundation asked physician specialty groups to identify procedures or tests that they judged to have little value (the Choosing Wisely campaign), imaging tests were frequently cited, including CT for minor head injury in the ED (American College of Emergency Physicians); imaging studies in patients with nonspecific low back pain (American College of Physicians); imaging for uncomplicated headache (American College of Radiology); CT angiography for patients with low clinical probability of pulmonary embolism and a negative d-dimer assay result (American College of Chest Physicians); and cardiac stress imaging in patients without high-risk markers for coronary artery disease (American College of Cardiology).

The recognition that more appropriate use of imaging could improve quality and reduce costs has led to the development of interventions to encourage more appropriate use of radiology. Some of these interventions have made use of the computerized clinical decision-support (CCDS) capabilities of electronic health records (EHRs). Given that adoption of EHRs is expanding, we undertook a systematic review and meta-analysis of EHR-based interventions to improve the appropriateness of diagnostic imaging. This work was performed for the Veterans Health Administration Choosing Wisely Workgroup. The key questions are as follows: What is the effectiveness of EHR-based interventions in reducing unnecessary or inappropriate diagnostic imaging, and does effectiveness vary by system? What are the harms or potential harms associated with EHR-based interventions used to reduce inappropriate imaging?
Methods

This systematic review is reported according to Preferred Reporting Items for Systematic Reviews and Meta-Analyses guidelines (4). A formal protocol was developed and submitted to PROSPERO (CRD42014007469). The protocol and preliminary results received input from a technical expert panel.

Data Sources and Searches

We began with a search of all studies included in 4 previous broad-based reviews of health information technology (IT) (5-8). These reviews were done using similar search strategies (between 1995 and 2013) and inclusion or exclusion criteria and were designed to identify all published hypothesis-testing studies of clinical health IT. Hypothesis-testing studies included randomized trials and controlled before-and-after, time-series, and pre-post studies. In the original reviews, these studies were further classified according to health IT functionality. Recently published summary data showed that 417 of 1057 health IT studies published from 1995 to 2013 were classified as CCDS (8). These 417 titles and abstracts were screened for studies eligible for this review (that is, CCDS aimed at improving the appropriateness of diagnostic imaging use). We next searched PubMed from 2011 to 10 September 2014, looking specifically at decision support for imaging, and conducted Web of Science and PubMed searches of key references (see Appendix Table 1).

Appendix Table 1. Search Strategy

We also reference-mined 3 potentially relevant systematic reviews: on computerized physician order entry and medical imaging (9), CCDS for chronic disease management (10), and CCDS with potential for inpatient cost reduction (11).

Study Selection

All reference titles and abstracts were screened in duplicate. Full-text articles were then reviewed in duplicate, and all discrepancies were discussed with the group. Inclusion criteria began with the enrollment of an adult population: studies aimed only at children were excluded, whereas studies with mixed populations were included. Interventions needed to be EHR-based and intended to reduce imaging for diagnostic purposes considered inappropriate or unnecessary on the basis of clinical guidelines. Because we judged increasing appropriate use to be conceptually related to decreasing inappropriate use, we also included studies measuring this outcome. Interventions addressing screening imaging (for example, to increase the use of mammography for breast cancer screening) were excluded, as were studies of systems running on personal digital assistants. Studies of Web-based interventions or computerized, stand-alone systems that we judged could be easily incorporated into the EHR were included. All types of comparison groups were included. Outcomes needed to be rates of imaging procedures judged appropriate or inappropriate on the basis of existing clinical guidelines or locally developed guidelines. Studies that targeted imaging procedures described as overused and reported only data on use were also included. Studies in all settings were included, and country of origin was not an exclusion criterion.

Data Extraction and Quality Assessment

Data were extracted by 2 reviewers, and discrepancies were reconciled with the group. From each article, we abstracted data on study design, time period, setting, imaging method, intervention, comparison, sample size, target of intervention, findings, IT design, data entry for the intervention, and implementation characteristics.
Interventions were classified according to whether they were integrated with computerized physician order entry, gave real-time feedback at the point of care, suggested a recommended course of action, had a stop that had to be justified or overridden, were automated through the EHR, or required clinical staff to enter data specifically for the intervention. We also assessed whether the intervention was developed by means of iterative testing or pilot tests, whether clinicians or users were trained, whether audit and feedback (or other internal incentives) were used, and whether the authors of the study had also developed the intervention. We assessed the quality of studies by their design (randomized vs. observational) and the degree to which they reported information about the intervention and implementation characteristics listed above.

Data Synthesis and Analysis

The effect of the intervention on appropriateness was the primary outcome. Studies measuring an increase in appropriateness were considered along with those measuring a decrease in inappropriate use. The effect on use was a secondary outcome. For studies presenting count data, or for which a count could be calculated (from a percentage), an odds ratio and associated SE were calculated. For comparability, the log odds ratios and their SEs were converted into Cohen d effect sizes using Stata SE, version 10 (StataCorp) (12, 13); a worked form of this conversion appears at the end of this section. For studies presenting means and measures of variation, Cohen d effect sizes were calculated directly. For each study, we used the difference between the periods before and after the intervention, the difference between the time-series projection of performance in the absence of the intervention and the actual performance during the intervention, or the difference between providers randomly assigned to the intervention or control, as appropriate to the study design and the available data. Results were converted to effect sizes for the analysis. Random-effects meta-analyses were conducted, and results were pooled using the Hartung-Knapp-Sidik-Jonkman variance estimator (14); a minimal pooling sketch also appears at the end of this section.

After collecting data on the interventions, implementations, and settings, but before extracting outcomes data, we developed 4 hypotheses about the effectiveness of the intervention, 1 in each category: intervention characteristics, setting, implementation, and target. First, interventions will vary in effectiveness according to the following rank order: interventions that present only information (A interventions); interventions that include a pop-up or reminder that the selected radiographic examination does not meet current guidelines (B interventions); interventions that require an active override for providers to continue to order a radiographic examination that is not supported by guidelines (that is, a soft stop) (C interventions); and interventions that forbid providers from ordering a radiographic examination that is not supported by guidelines unless or until they consult a peer or an expert (that is, a hard stop) (D interventions). An illustrative sketch of these 4 tiers follows this section. Second, interventions will be more effective in integrated care delivery settings.
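As context for the effect-size conversion described in the Data Synthesis and Analysis subsection, the following is a worked form of the standard logistic-distribution approximation for converting a log odds ratio and its SE to a Cohen d; the 2 × 2 cell counts (intervention/control by inappropriate-image yes/no) are generic placeholders, not data from the review, and the review performed this step in Stata.

```latex
% Odds ratio and its SE from generic 2x2 cell counts
\[
\mathrm{OR} = \frac{n_{11}\, n_{00}}{n_{10}\, n_{01}}, \qquad
\operatorname{SE}(\ln \mathrm{OR}) =
\sqrt{\frac{1}{n_{11}} + \frac{1}{n_{10}} + \frac{1}{n_{01}} + \frac{1}{n_{00}}}
\]
% Logistic-distribution conversion of the log odds ratio to Cohen d
\[
d = \frac{\sqrt{3}}{\pi}\,\ln \mathrm{OR}, \qquad
\operatorname{SE}(d) = \frac{\sqrt{3}}{\pi}\,\operatorname{SE}(\ln \mathrm{OR})
\]
```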
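Likewise, the following is a minimal sketch of random-effects pooling with the Hartung-Knapp-Sidik-Jonkman (HKSJ) variance adjustment. The review used Stata; this Python version assumes a DerSimonian-Laird estimate of the between-study variance (one common choice, not necessarily the review's), and the effect sizes and SEs in the example are made up for illustration.

```python
import numpy as np
from scipy import stats

def hksj_pool(d, se, alpha=0.05):
    """Pool Cohen d effect sizes with DerSimonian-Laird tau^2
    and the HKSJ small-sample variance adjustment."""
    d, se = np.asarray(d, float), np.asarray(se, float)
    k = d.size
    w = 1.0 / se**2                            # inverse-variance weights
    mu_fixed = np.sum(w * d) / np.sum(w)
    q = np.sum(w * (d - mu_fixed)**2)          # Cochran Q
    c = np.sum(w) - np.sum(w**2) / np.sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c)         # DL between-study variance
    w_star = 1.0 / (se**2 + tau2)              # random-effects weights
    mu = np.sum(w_star * d) / np.sum(w_star)
    # HKSJ: weighted residual variance around the pooled mean,
    # with a t (not z) critical value on k - 1 degrees of freedom
    var_hksj = np.sum(w_star * (d - mu)**2) / ((k - 1) * np.sum(w_star))
    half = stats.t.ppf(1 - alpha / 2, df=k - 1) * np.sqrt(var_hksj)
    return mu, (mu - half, mu + half)

# Placeholder effect sizes (Cohen d) and SEs for 5 hypothetical studies
d_pooled, ci = hksj_pool([0.30, 0.12, 0.45, 0.08, 0.25],
                         [0.10, 0.08, 0.15, 0.05, 0.12])
print(f"pooled d = {d_pooled:.2f}, 95% CI {ci[0]:.2f} to {ci[1]:.2f}")
```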
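Finally, a purely illustrative sketch of the A-D intervention hierarchy from the first hypothesis; the function, enum names, and order dispositions are hypothetical stand-ins, not the logic of any CCDS system evaluated in the review.

```python
from enum import Enum
from typing import Optional

class InterventionLevel(Enum):
    A_INFO_ONLY = "display guideline information with the order"
    B_REMINDER = "pop-up: selected study does not meet current guidelines"
    C_SOFT_STOP = "require the clinician to document an override reason"
    D_HARD_STOP = "block the order pending peer or expert consultation"

def handle_order(level: InterventionLevel, meets_guidelines: bool,
                 override_reason: Optional[str] = None) -> str:
    """Return the disposition of an imaging order under each tier."""
    if meets_guidelines:
        return "order placed"
    if level is InterventionLevel.A_INFO_ONLY:
        return "order placed (information shown)"          # A: no barrier
    if level is InterventionLevel.B_REMINDER:
        return "order placed (reminder acknowledged)"      # B: passive alert
    if level is InterventionLevel.C_SOFT_STOP:             # C: soft stop
        return ("order placed with documented override" if override_reason
                else "order held: override reason required")
    return "order blocked: consultation required"          # D: hard stop

print(handle_order(InterventionLevel.C_SOFT_STOP, meets_guidelines=False))
# -> "order held: override reason required"
```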