Dense CNN With Self-Attention for Time-Domain Speech Enhancement
暂无分享,去创建一个
[1] Jinwon Lee,et al. A Fully Convolutional Neural Network for Speech Enhancement , 2016, INTERSPEECH.
[2] Antonio Bonafonte,et al. SEGAN: Speech Enhancement Generative Adversarial Network , 2017, INTERSPEECH.
[3] Janet M. Baker,et al. The Design for the Wall Street Journal-based CSR Corpus , 1992, HLT.
[4] DeLiang Wang,et al. A New Framework for Supervised Speech Enhancement in the Time Domain , 2018, INTERSPEECH.
[5] Kuldip K. Paliwal,et al. The importance of phase in speech enhancement , 2011, Speech Commun..
[6] Kilian Q. Weinberger,et al. Densely Connected Convolutional Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[7] Lukasz Kaiser,et al. Attention is All you Need , 2017, NIPS.
[8] Jungwon Lee,et al. T-GSA: Transformer with Gaussian-Weighted Self-Attention for Speech Enhancement , 2020, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[9] Umut Isik,et al. Attention Wave-U-Net for Speech Enhancement , 2019, 2019 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA).
[10] Jae S. Lim,et al. The unimportance of phase in speech enhancement , 1982 .
[11] Marc Delcroix,et al. Speech Enhancement Using Self-Adaptation and Multi-Head Self-Attention , 2020, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[12] DeLiang Wang,et al. Learning Complex Spectral Mapping for Speech Enhancement with Improved Cross-Corpus Generalization , 2020, INTERSPEECH.
[13] Shuang Xu,et al. Speech-Transformer: A No-Recurrence Sequence-to-Sequence Model for Speech Recognition , 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[14] Daniel Rueckert,et al. Real-Time Single Image and Video Super-Resolution Using an Efficient Sub-Pixel Convolutional Neural Network , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[15] DeLiang Wang,et al. Exploring Deep Complex Networks for Complex Spectrogram Enhancement , 2019, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[16] Xavier Serra,et al. A Wavenet for Speech Denoising , 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[17] Han Zhang,et al. Self-Attention Generative Adversarial Networks , 2018, ICML.
[18] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[19] Deliang Wang,et al. On Cross-Corpus Generalization of Deep Learning Based Speech Enhancement , 2020, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[20] Nima Mesgarani,et al. Conv-TasNet: Surpassing Ideal Time–Frequency Magnitude Masking for Speech Separation , 2018, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[21] Li-Rong Dai,et al. A Regression Approach to Speech Enhancement Based on Deep Neural Networks , 2015, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[22] Vincent Dumoulin,et al. Deconvolution and Checkerboard Artifacts , 2016 .
[23] Balaji Thoshkahna,et al. Voice and accompaniment separation in music using self-attention convolutional neural network , 2020, ArXiv.
[24] Luca Antiga,et al. Automatic differentiation in PyTorch , 2017 .
[25] Geoffrey E. Hinton,et al. Layer Normalization , 2016, ArXiv.
[26] Yu Tsao,et al. Speech enhancement based on deep denoising autoencoder , 2013, INTERSPEECH.
[27] DeLiang Wang,et al. TCNN: Temporal Convolutional Neural Network for Real-time Speech Enhancement in the Time Domain , 2019, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[28] Yu Tsao,et al. End-to-End Waveform Utterance Enhancement for Direct Evaluation Metrics Optimization by Fully Convolutional Neural Networks , 2017, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[29] DeLiang Wang,et al. Complex Ratio Masking for Monaural Speech Separation , 2016, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[30] Yu Tsao,et al. Complex spectrogram enhancement by convolutional neural network with multi-metrics learning , 2017, 2017 IEEE 27th International Workshop on Machine Learning for Signal Processing (MLSP).
[31] Natalia Gimelshein,et al. PyTorch: An Imperative Style, High-Performance Deep Learning Library , 2019, NeurIPS.
[32] DeLiang Wang,et al. Supervised Speech Separation Based on Deep Learning: An Overview , 2017, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[33] DeLiang Wang,et al. Densely Connected Neural Network with Dilated Convolutions for Real-Time Speech Enhancement in The Time Domain , 2020, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[34] DeLiang Wang,et al. On Adversarial Training and Loss Functions for Speech Enhancement , 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[35] Takuya Yoshioka,et al. Dual-Path RNN: Efficient Long Sequence Modeling for Time-Domain Single-Channel Speech Separation , 2020, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[36] Mark Hasegawa-Johnson,et al. Speech Enhancement Using Bayesian Wavenet , 2017, INTERSPEECH.
[37] DeLiang Wang,et al. Monaural Speech Dereverberation Using Temporal Convolutional Networks With Self Attention , 2020, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[38] DeLiang Wang,et al. Learning Complex Spectral Mapping With Gated Convolutional Recurrent Networks for Monaural Speech Enhancement , 2020, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[39] Andries P. Hekstra,et al. Perceptual evaluation of speech quality (PESQ)-a new method for speech quality assessment of telephone networks and codecs , 2001, 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.01CH37221).
[40] Jesper Jensen,et al. An Algorithm for Intelligibility Prediction of Time–Frequency Weighted Noisy Speech , 2011, IEEE Transactions on Audio, Speech, and Language Processing.
[41] DeLiang Wang,et al. On Training Targets for Supervised Speech Separation , 2014, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[42] DeLiang Wang,et al. Long short-term memory for speaker generalization in supervised speech separation. , 2017, The Journal of the Acoustical Society of America.
[43] Jonathan Le Roux,et al. Phase-sensitive and recognition-boosted speech separation using deep recurrent neural networks , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[44] Jian Sun,et al. Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).
[45] Yu Tsao,et al. SNR-Aware Convolutional Neural Network Modeling for Speech Enhancement , 2016, INTERSPEECH.
[46] DeLiang Wang,et al. Large-scale training to increase speech intelligibility for hearing-impaired listeners in novel noises. , 2016, The Journal of the Acoustical Society of America.
[47] DeLiang Wang,et al. A New Framework for CNN-Based Speech Enhancement in the Time Domain , 2019, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[48] DeLiang Wang,et al. Towards Scaling Up Classification-Based Speech Separation , 2013, IEEE Transactions on Audio, Speech, and Language Processing.
[49] Björn W. Schuller,et al. Speech Enhancement with LSTM Recurrent Neural Networks and its Application to Noise-Robust ASR , 2015, LVA/ICA.
[50] DeLiang Wang,et al. Gated Residual Networks With Dilated Convolutions for Monaural Speech Enhancement , 2019, IEEE/ACM Transactions on Audio, Speech, and Language Processing.