Effectiveness of LSTMs in Predicting Congestive Heart Failure Onset

In this paper we present a recurrent neural network (RNN) based architecture that achieves an AUCROC of 0.9147 for predicting the onset of Congestive Heart Failure (CHF) 15 months in advance, using a 12-month observation window on a large cohort of 216,394 patients. We believe this to be the largest study of CHF onset prediction with respect to the number of CHF case patients in the cohort and in the test set (3,332 CHF patients) on which the AUC metrics are reported. We explore the extent to which an LSTM (Long Short-Term Memory) based model, a variant of RNNs, can accurately predict the onset of CHF when compared to known baselines such as Logistic Regression and Random Forests and to deep learning based models such as Multi-Layer Perceptrons and Convolutional Neural Networks. We use demographics, medical diagnosis and procedure data from 21,405 CHF and 194,989 control patients as our features. We describe our feature embedding strategy for medical diagnosis codes, which accommodates the sparse, irregular, longitudinal, and high-dimensional characteristics of EHR data. We empirically show that LSTMs can capture the longitudinal aspects of EHR data better than these baselines. As an attempt to interpret the model, we present a temporal data analysis-based technique on false positives to attribute feature importance. A model capable of predicting the onset of congestive heart failure months in advance with this level of accuracy and precision can support the efforts of practitioners to implement risk factor reduction strategies, and of researchers to begin to systematically evaluate interventions that could delay or avert the development of a disease with high mortality, morbidity and significant costs.
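For readers unfamiliar with the reported metric, the following is a minimal sketch of how an AUCROC value can be computed from predicted risk scores: it is the probability that a randomly chosen case patient receives a higher score than a randomly chosen control, with ties counted as half. The function name `auc_roc` and the toy labels and scores are illustrative assumptions, not the authors' evaluation code or data.

```python
from typing import Sequence


def auc_roc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Pairwise (Mann-Whitney) estimate of the area under the ROC curve.

    labels: 1 for a case (e.g. CHF onset), 0 for a control.
    scores: predicted risk, higher meaning more likely to be a case.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one control")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # case ranked above control
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: two cases and three controls.
toy_labels = [1, 0, 1, 0, 0]
toy_scores = [0.9, 0.2, 0.4, 0.5, 0.1]
toy_auc = auc_roc(toy_labels, toy_scores)
print(f"toy AUC = {toy_auc:.4f}")
```

The pairwise loop is quadratic in the number of patients, so on a cohort of the size described here a rank-based implementation from an established library would be the practical choice; the sketch is only meant to make the definition concrete.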
