Assessing electronic health record phenotypes against gold-standard diagnostic criteria for diabetes mellitus

Objective: We assessed the sensitivity and specificity of 8 electronic health record (EHR)-based phenotypes for diabetes mellitus against gold-standard American Diabetes Association (ADA) diagnostic criteria via chart review by clinical experts.

Materials and Methods: We identified EHR-based diabetes phenotype definitions that were developed for various purposes by a variety of users, including academic medical centers, Medicare, the New York City Health Department, and pharmacy benefit managers. We applied these definitions to a sample of 173 503 patients with records in the Duke Health System Enterprise Data Warehouse and at least 1 visit over a 5-year period (2007–2011). Of these patients, 22 679 (13%) met the criteria of 1 or more of the selected diabetes phenotype definitions. A statistically balanced sample of these patients was selected for chart review by clinical experts to determine the presence or absence of type 2 diabetes in the sample.

Results: The sensitivity (62–94%) and specificity (95–99%) of EHR-based type 2 diabetes phenotypes (compared with the gold-standard ADA criteria via chart review) varied depending on the component criteria and timing of observations and measurements.

Discussion and Conclusions: Researchers using EHR-based phenotype definitions should clearly specify the characteristics that make up each definition, any variations of the ADA criteria, and how different phenotype definitions and components affect the patient populations retrieved and the intended application. Careful attention to phenotype definitions is critical if the promise of leveraging EHR data to improve individual and population health is to be fulfilled.
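As an illustration of the two measures reported above, the following minimal Python sketch computes sensitivity and specificity for a single phenotype definition against chart-review labels. The function name, variable names, and toy data are hypothetical and are not taken from the study; the sketch also assumes a simple unweighted comparison and does not account for the sampling design used to select charts for review.

```python
def sensitivity_specificity(phenotype_flags, chart_review_flags):
    """Return (sensitivity, specificity) of a phenotype against a reference standard.

    phenotype_flags: iterable of bool, True if the EHR phenotype flags the patient.
    chart_review_flags: iterable of bool, True if chart review confirms diabetes.
    Both names are illustrative assumptions, not identifiers from the study.
    """
    tp = fp = tn = fn = 0
    for flagged, confirmed in zip(phenotype_flags, chart_review_flags):
        if flagged and confirmed:
            tp += 1  # phenotype and chart review agree: disease present
        elif flagged and not confirmed:
            fp += 1  # phenotype flags a patient without confirmed disease
        elif not flagged and confirmed:
            fn += 1  # phenotype misses a confirmed case
        else:
            tn += 1  # phenotype and chart review agree: disease absent

    # Sensitivity: share of confirmed cases that the phenotype detects.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    # Specificity: share of confirmed non-cases that the phenotype leaves unflagged.
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Hypothetical toy data for eight patients (not study data).
phenotype = [True, True, False, False, True, False, False, False]
chart = [True, True, True, False, False, False, False, False]

sens, spec = sensitivity_specificity(phenotype, chart)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Running the same function once per phenotype definition would give one sensitivity and specificity pair per definition, which is the kind of comparison the reported ranges summarize.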
