PANACEA cough sound-based diagnosis of COVID-19 for the DiCOVA 2021 Challenge

The COVID-19 pandemic has saturated public health services worldwide. In this scenario, early diagnosis of SARS-CoV-2 infections can help to stop or slow the spread of the virus and to manage the demand on health services. This is especially important when resources are also stretched by heightened demand linked to other seasonal diseases, such as the flu. In this context, the organisers of the DiCOVA 2021 challenge have collected a database with the aim of diagnosing COVID-19 from cough audio samples. This work presents the details of the automatic system for COVID-19 detection from cough recordings submitted by team PANACEA. The team consists of researchers from two European academic institutions and one company: EURECOM (France), University of Granada (Spain), and Biometric Vox S.L. (Spain). We developed several systems based on established signal processing and machine learning methods. Our best system employs a frontend based on Teager energy operator cepstral coefficients (TECCs) and a light gradient boosting machine (LightGBM) backend. The AUC obtained by this system on the test set is 76.31%, which corresponds to a 10% improvement over the official baseline.
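
For readers unfamiliar with the Teager energy operator underlying the TECC frontend, its standard discrete form is psi[n] = x[n]^2 - x[n-1] * x[n+1]. The sketch below illustrates only this operator; it is not the team's TECC frontend, whose filterbank and cepstral stages are not described in this abstract. The function name `teager_energy` and the test signal are assumptions made for illustration.

```python
import math


def teager_energy(x):
    """Discrete Teager energy operator: psi[n] = x[n]^2 - x[n-1] * x[n+1].

    Returns len(x) - 2 values, one for each interior sample of x.
    """
    return [x[n] ** 2 - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


# Illustrative input (an assumption, not challenge data): a pure cosine.
amplitude, omega = 2.0, 0.3
signal = [amplitude * math.cos(omega * n) for n in range(50)]

energy = teager_energy(signal)

# For a pure cosine the operator is constant and equals (A * sin(omega))^2,
# so it tracks both the amplitude and the frequency of the oscillation.
expected = (amplitude * math.sin(omega)) ** 2
```

For a pure tone the output is the same at every sample, which is why the operator is often described as an estimate of the energy needed to generate the signal rather than of the signal's squared amplitude alone.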
