SP-SEDT: Self-supervised Pre-training for Sound Event Detection Transformer
[1] Junying Chen, et al. UP-DETR: Unsupervised Pre-training for Object Detection with Transformers, 2020, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[2] Yan Song, et al. Robust sound event recognition using convolutional neural networks, 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[3] Tomoki Toda, et al. Weakly-Supervised Sound Event Detection with Self-Attention, 2020, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[4] Lukasz Kaiser, et al. Attention is All you Need, 2017, NIPS.
[5] Yueliang Qian, et al. Sound Event Detection Transformer: An Event-based End-to-End Model for Sound Event Detection, 2021, ArXiv.
[6] Ankit Shah, et al. Sound Event Detection in Domestic Environments with Weakly Labeled Data and Soundscape Synthesis, 2019, DCASE.
[7] Ming-Wei Chang, et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, 2019, NAACL.
[8] Xiangdong Wang, et al. Learning generic feature representation with synthetic data for weakly-supervised sound event detection by inter-frame distance loss, 2020, ArXiv.
[9] Nicolas Usunier, et al. End-to-End Object Detection with Transformers, 2020, ECCV.
[10] Ting Gong, et al. Label-efficient audio classification through multitask learning and self-supervision, 2019, ArXiv.
[11] Bhiksha Raj, et al. Improving weakly supervised sound event detection with self-supervised auxiliary tasks, 2021, Interspeech.
[12] Aren Jansen, et al. CNN architectures for large-scale audio classification, 2016, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[13] Jian Sun, et al. Deep Residual Learning for Image Recognition, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[14] Xiangdong Wang, et al. Guided Learning Convolution System for DCASE 2019 Task 4, 2019, DCASE.
[15] Alexei Baevski, et al. wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations, 2020, NeurIPS.
[16] Ye Wang, et al. A-CRNN: A Domain Adaptation Model for Sound Event Detection, 2020, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[17] Vittorio Murino, et al. Audio Surveillance: a Systematic Review, 2014.
[18] Annamaria Mesaros, et al. Metrics for Polyphonic Sound Event Detection, 2016.