Learning to Identify Patients at Risk of Uncontrolled Hypertension Using Electronic Health Records Data

Hypertension is a major risk factor for stroke, cardiovascular disease, and end-stage renal disease, and its prevalence is expected to rise dramatically. Effective hypertension management is thus critical, and a particular priority is decreasing the incidence of uncontrolled hypertension. Early identification of patients at risk for uncontrolled hypertension would allow targeted use of personalized, proactive treatments. We develop machine learning models (logistic regression and recurrent neural networks) to stratify patients by their risk of exhibiting uncontrolled hypertension within the coming three-month period. We trained and tested the models using electronic health record (EHR) data from 14,407 and 3,009 patients, respectively. The best model achieved an AUROC of 0.719, outperforming the simple but competitive baseline of basing predictions on the last blood pressure (BP) measurement alone (AUROC 0.634). Perhaps surprisingly, recurrent neural networks did not outperform a simple logistic regression on this task, suggesting that linear models should be included as strong baselines for predictive tasks using EHR data.
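The comparison described above (a logistic regression scored by AUROC against a baseline that ranks patients by their last BP reading alone) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the data are synthetic, the two features (`last_bp`, `mean_bp`) and the helper names are hypothetical, and this is not the authors' pipeline, so the resulting values will not match the reported 0.719 and 0.634.

```python
import math
import random


def auroc(labels, scores):
    """AUROC as the chance a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def fit_logistic(X, y, lr=0.1, epochs=300):
    """Batch gradient descent on the logistic loss; returns (weights, bias)."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        gw, gb = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, xi))
            err = 1.0 / (1.0 + math.exp(-z)) - yi  # predicted probability minus label
            gw = [g + err * xj for g, xj in zip(gw, xi)]
            gb += err
        w = [wj - lr * g / len(X) for wj, g in zip(w, gw)]
        b -= lr * gb / len(X)
    return w, b


def make_patients(n, rng):
    """Synthetic stand-in for EHR data (assumption: two standardized BP features)."""
    X, y = [], []
    for _ in range(n):
        last_bp = rng.gauss(0, 1)  # hypothetical: standardized last systolic BP
        mean_bp = rng.gauss(0, 1)  # hypothetical: standardized mean of earlier readings
        logit = 0.6 * last_bp + 0.9 * mean_bp - 0.5
        label = 1 if rng.random() < 1.0 / (1.0 + math.exp(-logit)) else 0
        X.append([last_bp, mean_bp])
        y.append(label)  # 1 = uncontrolled hypertension in the next window
    return X, y


rng = random.Random(0)
X_train, y_train = make_patients(600, rng)
X_test, y_test = make_patients(300, rng)

w, b = fit_logistic(X_train, y_train)

# Model score: the linear predictor (monotone in the predicted probability).
model_scores = [b + sum(wj * xj for wj, xj in zip(w, xi)) for xi in X_test]
# Baseline score: the last BP reading alone.
baseline_scores = [xi[0] for xi in X_test]

model_auc = auroc(y_test, model_scores)
baseline_auc = auroc(y_test, baseline_scores)
print(f"logistic regression AUROC: {model_auc:.3f}")
print(f"last-BP baseline AUROC:    {baseline_auc:.3f}")
```

Because AUROC depends only on how patients are ranked, the baseline needs no fitting at all: the raw last reading serves directly as a risk score, which is part of what makes it a simple but competitive reference point.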
