The impact of standardizing the definition of visits on the consistency of multi-database observational health research

Background: Use of administrative claims from multiple sources for research purposes is challenged by the lack of consistency in the structure of the underlying data and in the definition of data across claims data providers. This paper evaluates the impact of applying a standardized revenue code-based logic for defining inpatient encounters across two different claims databases.

Methods: We selected members who had complete enrollment in 2012 from the Truven MarketScan Commercial Claims and Encounters (CCAE) and the Optum Clinformatics (Optum) databases. The overall prevalence of inpatient conditions in the raw data was compared to that in the common data model (CDM) with the standardized visit definition applied.

Results: In CCAE, 87.18% of claims from 2012 that were classified as part of inpatient visits in the raw data were also classified as part of inpatient visits after the data were standardized to the CDM, and this overlap was consistent from 2006 to 2011. In contrast, Optum had 83.18% concordance in the classification of 2012 claims from inpatient encounters before and after standardization, but the consistency varied over time. The reclassification of inpatient encounters substantially affected the observed prevalence of medical conditions occurring in the inpatient setting and the consistency of prevalence estimates between the databases. On average, before standardization, each condition in Optum was 12% more prevalent than that same condition in CCAE; after standardization, the prevalence of conditions had a mean difference of only 1% between databases. Among 7,039 conditions reviewed, the difference in prevalence between the two databases was reduced after standardization for 67% of conditions.

Conclusions: To improve the consistency of research results across databases, one should review sources of database heterogeneity, such as the way data holders process raw claims data. Our study showed that applying the Observational Medical Outcomes Partnership (OMOP) CDM with a standardized approach for defining inpatient visits during the extract, transform, and load process can decrease the heterogeneity observed in disease prevalence estimates across two different claims data sources.
