Temporal sequence alignment in electronic health records for computable patient representation

Constructing patient representations from electronic health records (EHRs) is an active research topic because such representations are widely used to predict disease prognosis, medication outcomes, and mortality, and to identify patients who are similar to a target patient. Sequence alignment methods can preserve the temporal ordering of events in patient medical records when constructing computable patient representations, and they therefore merit a comprehensive and objective evaluation. In this work, we synthesized patient medical records by applying a set of synthesis operations to real patient medical records from a large real-world EHR database. We then tested two sequence alignment methods, dynamic time warping (DTW) and the Needleman-Wunsch algorithm (NWA), on the task of aligning patient medical records in order to understand their strengths and limitations. Our results show that both DTW and NWA outperform the reference alignment. DTW appears to align better than NWA by inserting new daily events and identifying more similarities between patient medical records. Incorporating medical knowledge could further improve the temporal sequence alignment produced by these algorithms and yield more accurate patient representations for predictive models and patient similarity calculation.
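
To make the two alignment methods concrete, the following is a minimal sketch of DTW and NWA applied to two patient records, each represented as a sequence of daily event sets. The abstract does not specify the event encoding, the distance measure, or the gap penalty, so everything here is an assumption for illustration: days are sets of diagnosis-like codes, day-to-day distance is Jaccard distance, and the NWA gap penalty is -0.5. The function names are hypothetical and this is not the authors' implementation.

```python
from typing import FrozenSet, List, Optional, Sequence, Tuple

# Assumption: one "day" of a record is a set of event codes.
Day = FrozenSet[str]


def jaccard_distance(a: Day, b: Day) -> float:
    """Distance in [0, 1] between two days; 0 means identical event sets."""
    if not a and not b:
        return 0.0
    return 1.0 - len(a & b) / len(a | b)


def dtw_distance(x: Sequence[Day], y: Sequence[Day]) -> float:
    """Classic DTW: one day may be matched to several consecutive days."""
    n, m = len(x), len(y)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = jaccard_distance(x[i - 1], y[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def needleman_wunsch(
    x: Sequence[Day], y: Sequence[Day], gap: float = -0.5
) -> Tuple[float, List[Tuple[Optional[int], Optional[int]]]]:
    """Global alignment; returns the score and aligned index pairs (None = gap)."""
    n, m = len(x), len(y)
    score = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sim = 1.0 - jaccard_distance(x[i - 1], y[j - 1])  # similarity in [0, 1]
            score[i][j] = max(
                score[i - 1][j - 1] + sim,  # align day i with day j
                score[i - 1][j] + gap,      # gap in y
                score[i][j - 1] + gap,      # gap in x
            )
    # Trace back from the bottom-right cell, preferring diagonal moves on ties.
    pairs: List[Tuple[Optional[int], Optional[int]]] = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + (
            1.0 - jaccard_distance(x[i - 1], y[j - 1])
        ):
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif i > 0 and (j == 0 or score[i][j] == score[i - 1][j] + gap):
            pairs.append((i - 1, None))
            i -= 1
        else:
            pairs.append((None, j - 1))
            j -= 1
    pairs.reverse()
    return score[n][m], pairs


# Hypothetical records: patient_b repeats the first day of patient_a.
patient_a = [frozenset({"I10"}), frozenset({"E11"}), frozenset({"N18"})]
patient_b = [frozenset({"I10"}), frozenset({"I10"}), frozenset({"E11"}), frozenset({"N18"})]

dtw_cost = dtw_distance(patient_a, patient_b)
nw_score, nw_pairs = needleman_wunsch(patient_a, patient_b)
```

In this toy case DTW absorbs the repeated day by matching one day of `patient_a` to two days of `patient_b`, whereas NWA must open a gap and pay the penalty. This is one simple way the two methods can differ in how they treat extra or missing days; the paper's own scoring may differ.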
