Respiratory Sound Classification Using an Attention LSTM Model with Mixup Data Augmentation

Auscultation is the most common method for diagnosing respiratory diseases, although its reliability depends largely on the physician's skill. To alleviate this drawback, we present an automatic system capable of distinguishing between different types of lung sounds (neutral, wheeze, crackle) in patients' respiratory recordings. The proposed system is based on Long Short-Term Memory (LSTM) networks fed with log-mel spectrograms, to which several improvements have been made. First, the frequency bands that carry the most useful information were determined experimentally in order to enhance the input acoustic features. Second, an attention mechanism was incorporated into the LSTM model to emphasize the audio frames most relevant to the task. Finally, a mixup data augmentation technique was adopted to mitigate the problem of data imbalance and improve the sensitivity of the system. The proposed methods were evaluated on the publicly available ICBHI 2017 dataset, achieving good results in comparison to the baseline.
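As a rough illustration of two of the components named above, the following sketch shows mixup blending of two training examples and a softmax-weighted attention pooling over per-frame vectors. It is not the authors' implementation: the function names, the value of `alpha`, the dot-product scoring with a single vector `w`, and the use of plain Python lists are all assumptions made for illustration, since the abstract does not specify these details.

```python
import math
import random


def mixup_pair(x_a, y_a, x_b, y_b, alpha=0.2, rng=random):
    """Blend two examples and their one-hot labels with one coefficient.

    x_a, x_b: spectrograms as lists of frames (same shape).
    y_a, y_b: one-hot label vectors.
    alpha:    Beta(alpha, alpha) parameter (value assumed, not from the paper).
    """
    lam = rng.betavariate(alpha, alpha)  # mixing coefficient in [0, 1]
    x_mix = [[lam * a + (1.0 - lam) * b for a, b in zip(fa, fb)]
             for fa, fb in zip(x_a, x_b)]
    y_mix = [lam * a + (1.0 - lam) * b for a, b in zip(y_a, y_b)]
    return x_mix, y_mix, lam


def attention_pool(frames, w):
    """Collapse a sequence of per-frame vectors into one vector.

    frames: per-frame outputs (e.g. LSTM hidden states), list of lists.
    w:      scoring vector, learned in a real model (hypothetical here).
    """
    # One scalar score per frame (dot product with w).
    scores = [sum(f_i * w_i for f_i, w_i in zip(f, w)) for f in frames]
    # Numerically stable softmax over time.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Weighted average of the frames.
    dim = len(frames[0])
    pooled = [sum(wt * f[d] for wt, f in zip(weights, frames))
              for d in range(dim)]
    return pooled, weights
```

In this sketch, mixing a minority-class example (such as a wheeze) with another example yields a soft label that still carries weight for the minority class, which is one way augmentation of this kind can help with class imbalance. The attention weights sum to one, so frames with higher scores contribute more to the pooled representation.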
