CONVOLUTION-AUGMENTED TRANSFORMER FOR SEMI-SUPERVISED SOUND EVENT DETECTION Technical Report
Tomoki Toda | Shinji Watanabe | Kazuya Takeda | Tomoki Hayashi | Tatsuya Komatsu | Koichi Miyazaki
[1] Naoyuki Kanda, et al. End-to-End Neural Speaker Diarization with Self-Attention, 2019, 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU).
[2] Xiaofei Wang, et al. A Comparative Study on Transformer vs RNN in Speech Applications, 2019, 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU).
[3] Hongyi Zhang, et al. mixup: Beyond Empirical Risk Minimization, 2017, ICLR.
[4] Tomoki Toda, et al. Weakly-Supervised Sound Event Detection with Self-Attention, 2020, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[5] Liyuan Liu, et al. On the Variance of the Adaptive Learning Rate and Beyond, 2019, ICLR.
[6] Yann Dauphin, et al. Language Modeling with Gated Convolutional Networks, 2016, ICML.
[7] Quoc V. Le, et al. Searching for Activation Functions, 2018, arXiv.
[8] Sacha Krstulovic, et al. A Framework for the Robust Evaluation of Sound Event Detection, 2020, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[9] Yu Zhang, et al. Conformer: Convolution-augmented Transformer for Speech Recognition, 2020, INTERSPEECH.
[10] Harri Valpola, et al. Weight-averaged consistency targets improve semi-supervised deep learning results, 2017, arXiv.
[11] Lukasz Kaiser, et al. Attention is All you Need, 2017, NIPS.
[12] Lionel Delphin-Poulat, et al. MEAN TEACHER WITH DATA AUGMENTATION FOR DCASE 2019 TASK 4 Technical Report, 2019.
[13] Annamaria Mesaros, et al. Metrics for Polyphonic Sound Event Detection, 2016.
[14] Ming-Wei Chang, et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, 2019, NAACL.