Transforming Two Decades of ePR Data to OMOP CDM for Clinical Research

This paper presents the extract, transform, and load (ETL) process that converts the Electronic Patient Records (ePR) of the Heart Institute (InCor) to the OMOP Common Data Model (CDM) format. We describe the initial database characterization, relational source mappings, selection filters, data transformations, and patient de-identification, carried out with open-source OHDSI tools and SQL scripts. We evaluate the resulting InCor-CDM database by recreating the patient cohort of a previous reference study, which was built over the original data source, and comparing the two cohorts' descriptive statistics and inclusion reports. The results show that up to 91% of the reference patients were retrieved from the ePR through InCor-CDM, with AUC = 0.938. These results indicate that the method produced a new database that is both consistent with the original data and compliant with the OMOP CDM standard.
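The cohort comparison used in the evaluation can be illustrated with a minimal sketch. It assumes that both cohorts are available as lists of patient identifiers that have already been mapped to a common key (the de-identification step would otherwise prevent a direct match). The function name `cohort_overlap`, the report fields, and the toy identifiers are hypothetical and are not taken from the paper.

```python
def cohort_overlap(reference_ids, recreated_ids):
    """Compare a reference cohort with a cohort recreated from the CDM.

    Returns the fraction of reference patients that were retrieved,
    plus the identifiers that appear in only one of the two cohorts.
    """
    reference = set(reference_ids)
    recreated = set(recreated_ids)
    if not reference:
        raise ValueError("reference cohort is empty")
    shared = reference & recreated
    return {
        "retrieved_fraction": len(shared) / len(reference),
        "missing_from_recreated": sorted(reference - recreated),
        "extra_in_recreated": sorted(recreated - reference),
    }


# Toy, made-up identifiers used purely for illustration.
report = cohort_overlap([101, 102, 103, 104], [102, 103, 104, 105])
print(report["retrieved_fraction"])      # 0.75
print(report["missing_from_recreated"])  # [101]
print(report["extra_in_recreated"])      # [105]
```

In this reading, the reported 91% would correspond to `retrieved_fraction`, while the two difference lists point at the records worth inspecting when the inclusion reports disagree.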
