IMPROVING SOUND EVENT DETECTION WITH AUXILIARY FOREGROUND-BACKGROUND CLASSIFICATION AND DOMAIN ADAPTATION

In this paper, we present two methods that improve the detection of sound events in domestic environments. First, motivated by the broad categorization of domestic sounds as foreground or background events according to their spectro-temporal structure, we propose to learn a foreground-background classifier jointly with the sound event classifier in a multi-task fashion, with the aim of improving the generalization of the latter. Second, sound event detection systems are commonly trained in a semi-supervised manner on synthetic labeled data together with unlabeled or partially labeled real data, with the aim of learning representations that are invariant across the two domains; even so, a performance gap remains when such systems are tested in real environments. To further reduce this data mismatch, we propose a domain adaptation strategy that uses optimal transport to align the empirical distributions of the feature representations of active and inactive frames of synthetic and real recordings. We show that these two approaches improve detection performance in terms of the event-based macro F1-score on the DESED dataset.
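The two ideas in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the auxiliary weight `fb_weight`, the squared Euclidean ground cost, and the entropic (Sinkhorn) solver with regularization `reg` are all assumptions chosen for illustration. The abstract does not specify how frames are grouped or which transport solver is used. The first function adds a foreground-background term to the event classification loss; the last one measures the transport cost between a batch of synthetic and a batch of real frame embeddings, which could serve as an alignment penalty.

```python
import numpy as np


def binary_cross_entropy(probs, targets, eps=1e-7):
    """Mean binary cross-entropy between predicted probabilities and 0/1 targets."""
    probs = np.clip(probs, eps, 1.0 - eps)
    losses = targets * np.log(probs) + (1.0 - targets) * np.log(1.0 - probs)
    return float(-np.mean(losses))


def multitask_loss(event_probs, event_targets, fb_probs, fb_targets, fb_weight=0.5):
    """Event loss plus a weighted auxiliary foreground-background loss.

    fb_weight is a hypothetical hyper-parameter, not a value from the paper.
    """
    event_loss = binary_cross_entropy(event_probs, event_targets)
    fb_loss = binary_cross_entropy(fb_probs, fb_targets)
    return event_loss + fb_weight * fb_loss


def sinkhorn_plan(cost, reg=0.1, n_iters=200):
    """Entropy-regularized transport plan between two uniform empirical measures."""
    n, m = cost.shape
    a = np.full(n, 1.0 / n)  # uniform mass on the first set of frames
    b = np.full(m, 1.0 / m)  # uniform mass on the second set of frames
    kernel = np.exp(-cost / reg)
    u = np.ones(n)
    v = np.ones(m)
    for _ in range(n_iters):
        v = b / (kernel.T @ u)
        u = a / (kernel @ v)
    return u[:, None] * kernel * v[None, :]


def ot_alignment_loss(synth_feats, real_feats, reg=0.1):
    """Transport cost between synthetic and real frame embeddings.

    Inputs have shape (n_frames, feat_dim). The plan is computed on a cost
    matrix rescaled to [0, 1] for numerical stability; the returned value is
    expressed in the original squared-distance units.
    """
    diff = synth_feats[:, None, :] - real_feats[None, :, :]
    cost = np.sum(diff ** 2, axis=-1)
    scale = cost.max()
    scaled_cost = cost / scale if scale > 0 else cost
    plan = sinkhorn_plan(scaled_cost, reg=reg)
    return float(np.sum(plan * cost))
```

In a training loop, a penalty of this kind would typically be added to the classification loss with its own weight and computed separately for each group of frames to be aligned (for instance, active and inactive frames), so that minimizing it pulls the synthetic and real feature distributions together.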
