Multi-objective optimization and explanation for stroke risk assessment in Shanxi province

Stroke is the leading cause of death in China (Zhou et al., The Lancet, 2019). Using a dataset from Shanxi Province, this work classifies each patient’s risk into one of four states (low, medium, high, attack) and reports the state transition tendency through a SHAP DeepExplainer. To improve accuracy on an imbalanced sample set, the Quadratic Interactive Deep Neural Network (QIDNN) model is first proposed, which flexibly selects and appends quadratic interactive features. Experimental results show that the QIDNN model with 7 interactive features achieves a state-of-the-art accuracy of 83.25%. Blood pressure, physical inactivity, smoking, weight, and total cholesterol are the five most important features. Then, to obtain high recall on the most urgent state, the attack state, stroke occurrence prediction is added as an auxiliary objective so that the model benefits from multi-objective optimization. Prediction accuracy improved, and the recall of the attack state rose from 67.93% to 84.83%, a relative improvement of 24.9% over QIDNN with the same features. The prediction model and analysis tool in this paper not only provide a theoretically optimized prediction method but also give an attribution explanation of each patient’s risk state and transition direction, offering doctors a useful tool for analysis and diagnosis.
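The quadratic interactive features mentioned in the abstract can be pictured as products of pairs of input features appended to the original inputs. The following is a minimal sketch under that assumption only: the function name `append_quadratic_interactions`, the toy data, and the chosen index pairs are hypothetical, and the abstract does not state how QIDNN selects its 7 interactions.

```python
import numpy as np


def append_quadratic_interactions(X, pairs):
    """Append the product x_i * x_j as a new column for each (i, j) in pairs.

    Illustrative helper (not from the paper): X is an (n_samples, n_features)
    array, pairs is a list of feature-index tuples chosen by some selection step.
    """
    X = np.asarray(X, dtype=float)
    interactions = [X[:, i] * X[:, j] for i, j in pairs]
    if not interactions:
        return X.copy()
    # column_stack turns each 1-D product into a new column next to X.
    return np.column_stack([X] + interactions)


# Hypothetical toy data: 3 patients, 4 numeric risk factors.
X = np.array([
    [1.0, 2.0, 3.0, 4.0],
    [0.5, 1.0, 0.0, 2.0],
    [2.0, 2.0, 1.0, 1.0],
])

# Hypothetical selection: interact features (0, 1) and (2, 3).
pairs = [(0, 1), (2, 3)]
X_aug = append_quadratic_interactions(X, pairs)
print(X_aug.shape)
```

The augmented matrix keeps the original columns first, so a downstream network sees both the raw features and the selected pairwise products.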
