DF-Conformer: Integrated Architecture of Conv-Tasnet and Conformer Using Linear Complexity Self-Attention for Speech Enhancement
暂无分享,去创建一个
Scott Wisdom | John R. Hershey | Michiel Bacchiani | Yuma Koizumi | Hakan Erdogan | Shigeki Karita | Llion Jones | Llion Jones | M. Bacchiani | J. Hershey | Scott Wisdom | Hakan Erdogan | Yuma Koizumi | Shigeki Karita
[1] Ming Zhou,et al. Continuous Speech Separation with Conformer , 2020, ArXiv.
[2] Dong Liu,et al. Dual-Path Transformer Network: Direct Context-Aware Modeling for End-to-End Monaural Speech Separation , 2020, INTERSPEECH.
[3] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[4] Zhuo Chen,et al. ESPnet-SE: End-To-End Speech Enhancement and Separation Toolkit Designed for ASR Integration , 2020, 2021 IEEE Spoken Language Technology Workshop (SLT).
[5] Mirco Ravanelli,et al. Attention Is All You Need In Speech Separation , 2020, ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[6] K. Takeda,et al. Conformer-Based Sound Event Detection with Semi-Supervised Learning and Data Augmentation , 2020, DCASE.
[7] Efthymios Tzinis,et al. Into the Wild with AudioScope: Unsupervised Audio-Visual Separation of On-Screen Sounds , 2020, ICLR.
[8] Nima Mesgarani,et al. Conv-TasNet: Surpassing Ideal Time–Frequency Magnitude Masking for Speech Separation , 2018, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[9] Antoine Deleforge,et al. Filterbank Design for End-to-end Speech Separation , 2020, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[10] Tomohiro Nakatani,et al. Frame-by-Frame Closed-Form Update for Mask-Based Adaptive MVDR Beamforming , 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[11] Joon Son Chung,et al. The Conversation: Deep Audio-Visual Speech Enhancement , 2018, INTERSPEECH.
[12] Sebastian Braun,et al. ICASSP 2021 Deep Noise Suppression Challenge , 2020 .
[13] Jonathan Le Roux,et al. Universal Sound Separation , 2019, 2019 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA).
[14] Tomohiro Nakatani,et al. Improving Noise Robust Automatic Speech Recognition with Single-Channel Time-Domain Enhancement Network , 2020, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[15] Scott Wisdom,et al. Differentiable Consistency Constraints for Improved Deep Speech Enhancement , 2018, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[16] Jonathan Le Roux,et al. Improved MVDR Beamforming Using Single-Channel Mask Prediction Networks , 2016, INTERSPEECH.
[17] Yu Zhang,et al. Conformer: Convolution-augmented Transformer for Speech Recognition , 2020, INTERSPEECH.
[18] DeLiang Wang,et al. Supervised Speech Separation Based on Deep Learning: An Overview , 2017, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[19] Shmuel Peleg,et al. Seeing Through Noise: Visually Driven Speaker Separation And Enhancement , 2017, ICASSP.
[20] Wei-Ping Zhu,et al. TSTNN: Two-Stage Transformer Based Neural Network for Speech Enhancement in the Time Domain , 2021, ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[21] Sebastian Braun,et al. Towards Efficient Models for Real-Time Deep Noise Suppression , 2021, ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[22] Ron J. Weiss,et al. Unsupervised Sound Separation Using Mixture Invariant Training , 2020, NeurIPS.
[23] Jesper Jensen,et al. An Algorithm for Predicting the Intelligibility of Speech Masked by Modulated Noise Maskers , 2016, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[24] Nikolaos Pappas,et al. Transformers are RNNs: Fast Autoregressive Transformers with Linear Attention , 2020, ICML.
[25] Yi Tay,et al. Efficient Transformers: A Survey , 2020, ArXiv.
[26] Yi Luo,et al. Ultra-Lightweight Speech Separation Via Group Communication , 2020, ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[27] Lukasz Kaiser,et al. Attention is All you Need , 2017, NIPS.
[28] Scott Wisdom,et al. End-To-End Diarization for Variable Number of Speakers with Local-Global Networks and Discriminative Speaker Embeddings , 2021, ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[29] Yusuke Hioka,et al. DNN-Based Source Enhancement to Increase Objective Sound Quality Assessment Score , 2018, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[30] Lukasz Kaiser,et al. Rethinking Attention with Performers , 2020, ArXiv.
[31] Efthymios Tzinis,et al. Asteroid: the PyTorch-based audio source separation toolkit for researchers , 2020, INTERSPEECH.