Predicting 30-day hospital readmissions using artificial neural networks with medical code embedding

Reducing unplanned readmissions is a major focus of current hospital quality efforts. To avoid unfair penalization, administrators and policymakers use prediction models built from healthcare claims data to risk-adjust hospital performance. Regression-based models are commonly used for such risk-standardization across hospitals; however, these models often have limited accuracy. In this study, we compare four prediction models for unplanned readmission among patients hospitalized with acute myocardial infarction (AMI), congestive heart failure (HF), and pneumonia (PNA) within the Nationwide Readmissions Database in 2014. We evaluated hierarchical logistic regression and compared its performance with gradient boosting and two models that use artificial neural networks. We show that unsupervised Global Vectors for Word Representation (GloVe) embeddings of administrative claims data, combined with artificial neural network classification models, significantly improve prediction of 30-day readmission. Compared to hierarchical logistic regression, our best models increased the AUC for prediction of 30-day readmissions from 0.68 to 0.72 for AMI, 0.60 to 0.64 for HF, and 0.63 to 0.68 for PNA. Furthermore, risk-standardized hospital readmission rates calculated from our artificial neural network model that employed embeddings led to reclassification of approximately 10% of hospitals across categories of hospital performance. This finding suggests that prediction models that incorporate new methods classify hospitals differently than traditional regression-based approaches and that their role in assessing hospital performance warrants further investigation.
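The AUC values reported above summarize how well a model ranks readmitted patients above non-readmitted ones. As a minimal sketch of that metric (not the study's evaluation code), the Python function below computes the AUC as the fraction of readmitted/non-readmitted pairs in which the readmitted patient receives the higher predicted risk, with ties counted as half. The function name `auc` and the toy labels and scores are illustrative assumptions and do not come from the Nationwide Readmissions Database.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney identity).

    labels: 1 = readmitted within 30 days, 0 = not readmitted (assumed coding).
    scores: predicted readmission risk from any model.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # readmitted patient ranked higher: concordant pair
            elif p == n:
                wins += 0.5   # tie counts as half a concordant pair
    return wins / (len(pos) * len(neg))


# Toy example (hypothetical data, for illustration only).
toy_labels = [1, 0, 1, 0, 0, 1]
toy_scores = [0.9, 0.2, 0.4, 0.5, 0.1, 0.7]
toy_auc = auc(toy_labels, toy_scores)
print(round(toy_auc, 3))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the improvements over hierarchical logistic regression are reported. The pairwise loop is quadratic in the number of patients, so a rank-based implementation would be preferable for datasets of realistic size.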
