A Weighted Voting Ensemble Self-Labeled Algorithm for the Detection of Lung Abnormalities from X-Rays

Over the last few decades, considerable effort has been devoted to extracting useful knowledge from large volumes of medical data using machine learning and data mining techniques. Advances in digital chest radiography have enabled research and medical centers to accumulate large repositories of images, some of them classified (labeled) by human experts but most of them unclassified (unlabeled). Semi-supervised learning algorithms have been proposed as a way to address this shortage of labeled data, by combining the explicit classification information of the labeled data with the information hidden in the unlabeled data. In the present work, we propose a new ensemble semi-supervised learning algorithm for the classification of lung abnormalities from chest X-rays, based on a new weighted voting scheme. The proposed algorithm assigns a vector of weights to each component classifier of the ensemble, based on its accuracy on each class. Our numerical experiments illustrate the efficiency of the proposed ensemble methodology against other state-of-the-art classification methods.
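The per-class weighting idea can be illustrated with a minimal Python sketch. It is not the paper's exact formulation: the abstract does not give the weight formula, so the sketch assumes that a classifier's weight for a class is the fraction of that class's labeled evaluation examples it predicts correctly. The function names `per_class_weights` and `weighted_vote`, the class names, and the toy predictions are all hypothetical.

```python
def per_class_weights(y_true, y_pred, classes):
    """Weight vector for one classifier: one weight per class.

    Assumption: the weight for a class is the fraction of that class's
    labeled examples the classifier predicts correctly. The paper's own
    formula may differ.
    """
    weights = {}
    for c in classes:
        total = sum(1 for t in y_true if t == c)
        correct = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        weights[c] = correct / total if total else 0.0
    return weights


def weighted_vote(votes, weight_vectors, classes):
    """Combine one vote per classifier into a single predicted class.

    Each classifier adds its own weight for the class it voted for;
    the class with the largest total wins (ties go to the class listed
    first in `classes`).
    """
    scores = {c: 0.0 for c in classes}
    for vote, weights in zip(votes, weight_vectors):
        scores[vote] += weights[vote]
    return max(classes, key=lambda c: scores[c])


# Toy labeled evaluation set (hypothetical data).
classes = ["normal", "abnormal"]
y_true = ["normal"] * 4 + ["abnormal"] * 4

# Hypothetical predictions of three component classifiers on that set.
pred_a = ["normal"] * 4 + ["abnormal", "abnormal", "normal", "normal"]
pred_b = ["normal", "normal", "abnormal", "abnormal",
          "abnormal", "abnormal", "normal", "normal"]
pred_c = ["normal", "normal", "normal", "abnormal",
          "abnormal", "normal", "normal", "normal"]

weight_vectors = [per_class_weights(y_true, p, classes)
                  for p in (pred_a, pred_b, pred_c)]

# Votes of the three classifiers on one new, unlabeled image.
votes = ["normal", "abnormal", "abnormal"]
prediction = weighted_vote(votes, weight_vectors, classes)
```

In this toy case two of the three classifiers vote "abnormal", but their weights for that class (0.5 and 0.25) sum to less than the first classifier's weight for "normal" (1.0), so the weighted vote returns "normal" where a plain majority vote would return "abnormal".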
