A Regularized Deep Learning Approach for Clinical Risk Prediction of Acute Coronary Syndrome Using Electronic Health Records

Objective: Acute coronary syndrome (ACS) is a common and severe cardiovascular disease, a leading cause of death, and the principal cause of serious long-term disability globally. Clinical risk prediction of ACS is important for early intervention and treatment. Existing ACS risk scoring models rely mainly on a small set of hand-picked risk factors and often dichotomize predictive variables to simplify score calculation. Methods: This study develops a regularized stacked denoising autoencoder (SDAE) model to stratify the clinical risk of ACS patients using a large volume of electronic health records (EHR). To capture the characteristics of patients at similar risk levels and to preserve the discriminating information across different risk levels, two constraints are added to the SDAE so that the reconstructed feature representations carry more risk information about patients, which contributes to better clinical risk prediction. Results: We validate our approach on a real clinical dataset of 3464 ACS patient samples. The performance of our approach for predicting ACS risk remains robust, reaching an AUC of 0.868 and an accuracy of 0.73. Conclusions: The results show that the proposed approach achieves competitive performance compared with state-of-the-art models on the clinical risk prediction problem. In addition, our approach can extract informative risk factors of ACS via a reconstructive learning strategy. Some of these extracted risk factors are not only consistent with existing medical domain knowledge but also suggest hypotheses that could be validated by further investigation in the medical domain.
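The Methods summary above can be made concrete with a small sketch. The abstract does not give the exact form of the two constraints, so the code below is only an illustration under stated assumptions: a single denoising autoencoder layer with tied weights, a within-class penalty that pulls hidden representations of patients at the same risk level toward their class centroid, and a between-class term that rewards separation of the centroids. The function name `regularized_dae_loss`, the weights `alpha` and `beta`, and the masking-noise rate `noise` are hypothetical and not taken from the paper.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def regularized_dae_loss(x, y, W, b, b_out, noise=0.2, alpha=0.1, beta=0.1, rng=None):
    """Loss of one denoising autoencoder layer plus two illustrative penalties.

    x: (n_samples, n_features) inputs scaled to [0, 1]
    y: (n_samples,) integer risk-level labels
    W: (n_features, n_hidden) encoder weights (decoder uses W.T, an assumption)
    """
    rng = np.random.default_rng(0) if rng is None else rng

    # Denoising step: randomly zero out a fraction of the input features.
    mask = rng.random(x.shape) >= noise
    h = sigmoid((x * mask) @ W + b)        # hidden representation
    x_hat = sigmoid(h @ W.T + b_out)       # reconstruction of the clean input
    recon = np.mean(np.sum((x - x_hat) ** 2, axis=1))

    # Centroid of the hidden representations for each risk level.
    labels = np.unique(y)
    centroids = np.stack([h[y == c].mean(axis=0) for c in labels])

    # Assumed constraint 1: patients at the same risk level stay close together.
    idx = np.searchsorted(labels, y)
    within = np.mean(np.sum((h - centroids[idx]) ** 2, axis=1))

    # Assumed constraint 2: centroids of different risk levels stay apart.
    k = len(labels)
    diffs = centroids[:, None, :] - centroids[None, :, :]
    between = np.sum(diffs ** 2) / (k * (k - 1)) if k > 1 else 0.0

    loss = recon + alpha * within - beta * between
    return loss, (recon, within, between)


if __name__ == "__main__":
    # Synthetic data only; not the clinical dataset described in the abstract.
    rng = np.random.default_rng(1)
    x = rng.random((12, 5))
    y = np.array([0, 1, 2] * 4)            # three hypothetical risk levels
    W = rng.normal(scale=0.1, size=(5, 3))
    loss, parts = regularized_dae_loss(x, y, W, np.zeros(3), np.zeros(5))
    print("loss:", loss, "components:", parts)
```

In a stacked model, a loss of this kind would be minimized layer by layer before supervised fine-tuning; the sketch only evaluates the objective and leaves optimization out.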
