Lightweight Causal Transformer with Local Self-Attention for Real-Time Speech Enhancement