A Machine Learning-Based Applied Prediction Model for Identification of Acute Coronary Syndrome (ACS) Outcomes and Mortality in Patients during the Hospital Stay

Machine learning (ML) is now widely used in medicine and health informatics, especially for the diagnosis and prognosis of cardiovascular diseases. We propose an ML-based soft-voting ensemble classifier (SVEC) for predictive modeling of acute coronary syndrome (ACS) outcomes (STEMI and NSTEMI), discharge reasons for hospitalized patients, and types of death during the hospital stay. We used the Korea Acute Myocardial Infarction Registry (KAMIR-NIH) dataset, which contains data on 13,104 patients with 551 features. After data extraction and preprocessing, we retained 125 useful features and applied the SMOTETomek hybrid sampling technique to correct the imbalance of the minority classes. The proposed SVEC combines three ML algorithms (random forest, extra tree, and gradient boosting machine) to model the target variables, and we compared its performance with that of each base classifier. In the experiments, the SVEC outperformed the other ML-based predictive models in accuracy (99.0733%), precision (99.0742%), recall (99.0734%), F1-score (99.9719%), and area under the ROC curve (AUC) (99.9702%). Overall, the SVEC performed better than the other applied models, although its AUC was slightly lower than that of the extra tree classifier for the predictive modeling of ACS outcomes. Because the proposed model outperformed the other ML-based models, it could be used in hospitals to support the diagnosis and prediction of heart problems, so that disease is detected in time, appropriate treatment can be chosen, and the occurrence of disease is predicted more accurately.
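The soft-voting rule behind an ensemble such as the SVEC can be illustrated with a minimal sketch. It assumes equal weights for the three base classifiers (the abstract does not state the weights), and the function names `soft_vote` and `predict` as well as the probability values are hypothetical, invented for illustration rather than taken from the KAMIR-NIH experiments. Each base classifier returns class probabilities for a patient, the ensemble averages them, and the class with the highest averaged probability is predicted.

```python
from typing import List, Optional, Sequence


def soft_vote(
    probas: Sequence[Sequence[float]],
    weights: Optional[Sequence[float]] = None,
) -> List[float]:
    """Return the (weighted) average of per-class probability vectors."""
    if not probas:
        raise ValueError("need at least one probability vector")
    n_classes = len(probas[0])
    if any(len(p) != n_classes for p in probas):
        raise ValueError("all probability vectors must have the same length")
    if weights is None:
        # Assumption: every base classifier gets the same weight.
        weights = [1.0] * len(probas)
    if len(weights) != len(probas):
        raise ValueError("need exactly one weight per classifier")
    total = float(sum(weights))
    return [
        sum(w * p[k] for w, p in zip(weights, probas)) / total
        for k in range(n_classes)
    ]


def predict(avg_proba: Sequence[float], labels: Sequence[str]) -> str:
    """Pick the label whose averaged probability is highest."""
    best = max(range(len(avg_proba)), key=lambda k: avg_proba[k])
    return labels[best]


# Toy probabilities for a single patient (illustrative values only).
labels = ["STEMI", "NSTEMI"]
rf_proba = [0.40, 0.60]   # random forest leans NSTEMI
et_proba = [0.45, 0.55]   # extra tree leans NSTEMI
gbm_proba = [0.90, 0.10]  # gradient boosting is confident in STEMI

avg = soft_vote([rf_proba, et_proba, gbm_proba])
print(avg, predict(avg, labels))
```

In this toy case a majority (hard) vote would choose NSTEMI, since two of the three classifiers lean that way, whereas soft voting chooses STEMI because the confident gradient boosting estimate dominates the average. In practice the same rule is available in scikit-learn as `VotingClassifier(voting="soft")`, and SMOTETomek resampling is provided by `imblearn.combine.SMOTETomek`.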
