Long Short-Term Memory Recurrent Neural Networks for Multiple Diseases Risk Prediction by Leveraging Longitudinal Medical Records

Many individuals suffer from chronic diseases that are not identified in time, which places a heavy disease burden on society. This paper presents a multiple disease risk prediction method that systematically assesses patients' future disease risks based on their longitudinal medical records. In this study, medical diagnoses coded with the International Classification of Diseases (ICD) are aggregated into different levels for prediction to meet the needs of different stakeholders. The proposed approach is validated using two independent hospital medical datasets, which include 7105 patients with 18,893 visits and 4170 patients with 13,124 visits, respectively. The initial analysis reveals high variation in patients' characteristics. The study demonstrates that a recurrent neural network with long short-term memory (LSTM) units performs well at different levels of diagnosis aggregation. In particular, the results show that the developed model can be applied to predicting patients' future disease risks, with exact-match scores of 98.90% and 95.12% using 3-digit ICD code aggregation, and 96.60% and 96.83% using 4-digit ICD code aggregation, for the two datasets, respectively. Moreover, the approach can be developed into a reference tool for hospital information systems, enhancing patients' healthcare management over time.
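To make the two technical notions in the abstract concrete, the following is a minimal sketch of ICD code aggregation and exact-match scoring. It rests on assumptions not stated in the abstract: aggregation is taken to be simple truncation of the code to its first 3 or 4 characters after dropping the dot, the exact-match score is taken to be the fraction of patients whose predicted diagnosis set equals the true set exactly, and the sample codes are ICD-10 style. The function names `aggregate_icd`, `aggregate_visit`, and `exact_match_score` are hypothetical and are not taken from the paper.

```python
from typing import Iterable, List, Set


def aggregate_icd(code: str, digits: int) -> str:
    """Truncate an ICD code to its first `digits` characters.

    Assumption: aggregation means prefix truncation after removing the dot;
    the paper may define its aggregation levels differently.
    """
    return code.replace(".", "").strip().upper()[:digits]


def aggregate_visit(codes: Iterable[str], digits: int) -> Set[str]:
    """Map all diagnosis codes of one visit to a set of aggregated codes."""
    return {aggregate_icd(c, digits) for c in codes}


def exact_match_score(y_true: List[Set[str]], y_pred: List[Set[str]]) -> float:
    """Fraction of samples whose predicted label set equals the true set.

    Assumption: this is the usual multi-label "subset accuracy".
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        return 0.0
    matches = sum(t == p for t, p in zip(y_true, y_pred))
    return matches / len(y_true)


if __name__ == "__main__":
    visit = ["E11.9", "E11.65", "I25.10"]  # illustrative codes only
    print(aggregate_visit(visit, 3))
    print(aggregate_visit(visit, 4))

    truth = [{"E11"}, {"I25"}]
    preds = [{"E11"}, {"I25", "E11"}]
    print(exact_match_score(truth, preds))
```

Under these assumptions, coarser aggregation merges related diagnoses into fewer labels, while the exact-match criterion is strict: a prediction with one extra or one missing code counts as a miss for that patient.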
