Mining Time-Stamped Electronic Health Records with Referenced Sequences

Electronic Health Records (EHRs) are typically stored as time-stamped encounter records. Observing the temporal relationships between medical records is an integral part of interpreting the information. Hence, statistical analysis of EHRs requires the creation of clinically informed time-interdependent analysis variables (TIAVs). Formulating and creating these variables is often iterative and requires custom code. We describe a technique that uses sequences of time-referenced entities as the building blocks for TIAVs. These sequences represent different aspects of a patient's medical history in a contiguous fashion. To illustrate the principles and applications of the method, we provide examples using the Veterans Health Administration's research databases. In the first example, sequences representing medication exposure were used to assess patient selection criteria for a comparative effectiveness study of treatments. In the second example, sequences of Charlson comorbidity conditions and clinical settings (inpatient or outpatient) were used to create variables that revealed data anomalies and trends. The third example demonstrates the creation of an analysis variable derived from the temporal dependency between medication exposure and comorbidity. Complex TIAVs can be created from the sequences with simple, reusable code, enabling unscripted or automated creation of TIAVs.
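
As a rough illustration of the idea, the following Python sketch encodes one patient's medication exposure as a contiguous daily sequence referenced to an index date, then derives simple variables from it. This is not the authors' implementation: the one-character-per-day encoding, the function names (exposure_sequence, longest_gap), the sample fill records, and the comorbidity string are all assumptions made for illustration.

```python
from datetime import date

def exposure_sequence(fills, index_date, n_days):
    """Return a string of n_days characters, one per day from index_date.

    '1' marks a day covered by at least one fill, '0' a day with no coverage.
    (Hypothetical encoding; fills are (fill_date, days_supply) pairs.)
    """
    days = ["0"] * n_days
    for fill_date, days_supply in fills:
        start = (fill_date - index_date).days
        for offset in range(start, start + days_supply):
            if 0 <= offset < n_days:
                days[offset] = "1"
    return "".join(days)

def longest_gap(sequence):
    """Length of the longest run of uncovered ('0') days in the sequence."""
    return max(len(run) for run in sequence.split("1"))

# Made-up fill records for a single patient: (fill date, days of supply).
fills = [(date(2020, 1, 1), 5), (date(2020, 1, 9), 4)]
seq = exposure_sequence(fills, date(2020, 1, 1), 14)

# A second, made-up sequence on the same time reference: '1' marks days on
# which a comorbidity is considered present.
comorbidity = "00000011111111"

# Because both sequences share the same reference, a time-interdependent
# variable is a simple position-wise comparison: days exposed while the
# comorbidity is present.
exposed_with_comorbidity = sum(
    a == "1" and b == "1" for a, b in zip(seq, comorbidity)
)

print(seq)
print(longest_gap(seq))
print(exposed_with_comorbidity)
```

Because every sequence is aligned to the same reference date, the same few string operations can be reused across medications, conditions, or care settings without writing variable-specific code.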
