Analysis of Machine Learning Algorithms for Diagnosis of Diffuse Lung Diseases

BACKGROUND  Diffuse lung diseases (DLDs) are a diverse group of pulmonary disorders, characterized by inflammation of lung tissue, which may lead to permanent loss of the ability to breathe and to death. Distinguishing among these diseases is challenging for physicians due to their wide variety and unknown causes. Computer-aided diagnosis (CAD) is a useful approach to improving diagnostic accuracy by combining information provided by experts with Machine Learning (ML) methods.

OBJECTIVES  To explore the potential of dimensionality reduction combined with ML methods for the diagnosis of DLDs, and to improve classification accuracy over state-of-the-art methods.

METHODS  A data set composed of 3252 regions of interest (ROIs) was used, from which 28 features were extracted per ROI. We used Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA), and Stepwise Selection (Forward, Backward, and Forward-Backward) to reduce feature dimensionality. The feature subsets obtained were used as input to the following ML methods: Support Vector Machine, Gaussian Mixture Model, k-Nearest Neighbor, and Deep Feedforward Neural Network (DFNN). We also applied a Deep Convolutional Neural Network directly to the ROIs.

RESULTS  The largest reduction, from 28 to 5 dimensions, was achieved using LDA. The best classification results were obtained by the DFNN, with an overall accuracy of 99.60%.

CONCLUSIONS  This work contributes to the analysis and selection of features that can efficiently characterize the DLDs studied.
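The reduce-then-classify pipeline described in the methods can be illustrated with a minimal sketch. This is not the authors' code: it uses synthetic data in place of the ROI features, pairs only PCA with k-Nearest Neighbor, and the number of classes (3), the class separation, the train/test split, and k = 5 are all assumptions chosen for illustration. Only the 28 input features and the 5 retained dimensions echo figures from the abstract (where the 5-dimensional result was obtained with LDA, not PCA).

```python
import numpy as np


def pca_fit(X, n_components):
    """Fit PCA via SVD; return (mean, components, explained_variance_ratio)."""
    mean = X.mean(axis=0)
    Xc = X - mean
    # Rows of Vt are principal directions, ordered by decreasing singular value.
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    var = s ** 2
    ratio = var[:n_components] / var.sum()
    return mean, Vt[:n_components], ratio


def pca_transform(X, mean, components):
    """Project samples onto the retained principal directions."""
    return (X - mean) @ components.T


def knn_predict(X_train, y_train, X_test, k=5):
    """Classify each test sample by majority vote among its k nearest neighbors."""
    # Squared Euclidean distance between every test and every training sample.
    d2 = ((X_test[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    nearest = np.argsort(d2, axis=1)[:, :k]
    votes = y_train[nearest]
    return np.array([np.bincount(v).argmax() for v in votes])


# Synthetic stand-in for the feature matrix (assumption: 3 classes, 100 samples each).
rng = np.random.default_rng(0)
n_per_class, n_features, n_classes = 100, 28, 3
X = rng.normal(size=(n_per_class * n_classes, n_features))
y = np.repeat(np.arange(n_classes), n_per_class)
X[:, :4] += 6.0 * y[:, None]  # shift the class means in the first four features

# Hold out 30% of the samples for testing.
perm = rng.permutation(len(y))
train, test = perm[:210], perm[210:]

# Fit the projection on the training split only, then apply it to both splits.
mean, components, ratio = pca_fit(X[train], n_components=5)
Z_train = pca_transform(X[train], mean, components)
Z_test = pca_transform(X[test], mean, components)

y_pred = knn_predict(Z_train, y[train], Z_test, k=5)
accuracy = float((y_pred == y[test]).mean())
print(Z_train.shape, round(accuracy, 3))
```

Fitting the projection on the training split alone keeps information from the test samples out of the reduced feature space, which matters when reported accuracies are compared across reduction methods.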
