A study of aortic dissection screening method based on multiple machine learning models

Background The main purpose of the study was to develop an early screening method for aortic dissection (AD) based on machine learning. Because AD is rare and its symptoms are complex, many doctors have no clinical experience with it. As a result, many patients are never suspected of having AD, which leads to a high rate of misdiagnosis. Here, we report a preliminary study on the feasibility of a rapid and accurate screening method for AD using machine learning.

Methods The dataset consisted of examination data provided by Xiangya Hospital of Central South University, China. It contained a total of 60,000 samples, including both aortic and non-aortic patients. Each sample has 76 features, consisting of routine examination results and other easily accessible information. Because affected patients are usually far outnumbered by unaffected ones, the data are class-imbalanced. Multiple machine learning models were therefore used, including AdaBoost, SmoteBagging, EasyEnsemble and CalibratedAdaMEC. These models rely on different strategies, such as ensemble learning, undersampling, oversampling, and cost-sensitive learning, to address the class imbalance.

Results AdaBoost performed poorly, with an average recall of 16.1% and a specificity of 99.8%. SmoteBagging performed statistically significantly better on this problem, with an average recall of 78.1% and a specificity of 79.2%. EasyEnsemble reached 77.8% recall and 79.3% specificity. CalibratedAdaMEC achieved a recall of 75.8% and a specificity of 76%.

Conclusions All of the models evaluated in this paper except AdaBoost had a misdiagnosis rate lower than 25%. The data used by these methods are routine examination data only. This suggests that machine learning methods can help build a fast, cheap, worthwhile and effective early screening approach for AD.
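The reported figures can be checked with a short calculation. The sketch below is illustrative only and is not the authors' code. It assumes that "misdiagnosis rate" refers to the two error rates implied by the reported metrics, namely the miss rate (1 - recall) and the false-alarm rate (1 - specificity); the abstract does not define the term. The function names and the confusion-matrix counts in the example are hypothetical.

```python
def recall_specificity(tp, fn, tn, fp):
    """Return (recall, specificity) from confusion-matrix counts."""
    recall = tp / (tp + fn)        # share of true AD cases that are flagged
    specificity = tn / (tn + fp)   # share of non-AD cases that are cleared
    return recall, specificity


def error_rates(recall, specificity):
    """Return (miss rate, false-alarm rate). Assumed reading of 'misdiagnosis rate'."""
    return 1 - recall, 1 - specificity


def below_threshold(metrics, threshold=0.25):
    """Map each model name to True if both error rates are under the threshold."""
    result = {}
    for name, (recall, specificity) in metrics.items():
        miss, false_alarm = error_rates(recall, specificity)
        result[name] = miss < threshold and false_alarm < threshold
    return result


# Average recall and specificity as reported in the abstract.
reported = {
    "AdaBoost": (0.161, 0.998),
    "SmoteBagging": (0.781, 0.792),
    "EasyEnsemble": (0.778, 0.793),
    "CalibratedAdaMEC": (0.758, 0.760),
}

if __name__ == "__main__":
    for name, ok in below_threshold(reported).items():
        miss, false_alarm = error_rates(*reported[name])
        print(f"{name}: miss={miss:.1%}, false alarm={false_alarm:.1%}, under 25%: {ok}")

    # Hypothetical counts (not from the study): 50 AD cases among 1,000 samples.
    # A classifier that flags very few patients can look accurate overall
    # while still missing most AD cases.
    tp, fn, tn, fp = 8, 42, 948, 2
    recall, specificity = recall_specificity(tp, fn, tn, fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    print(f"recall={recall:.1%}, specificity={specificity:.1%}, accuracy={accuracy:.1%}")
```

Under this reading, AdaBoost's 99.8% specificity hides a miss rate of about 84%, which is why recall and specificity are reported together rather than overall accuracy.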
