Monaural Speech Dereverberation Using Temporal Convolutional Networks With Self Attention
DeLiang Wang | Yan Zhao | Buye Xu | Tao Zhang
[1] DeLiang Wang,et al. A New Framework for CNN-Based Speech Enhancement in the Time Domain , 2019, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[2] K. Helfer,et al. Hearing loss, aging, and speech perception in reverberation and noise. , 1990, Journal of speech and hearing research.
[3] J.-M. Boucher,et al. A New Method Based on Spectral Subtraction for Speech Dereverberation , 2001 .
[4] Matthias Sperber,et al. Self-Attentional Acoustic Models , 2018, INTERSPEECH.
[5] Nima Mesgarani,et al. Conv-TasNet: Surpassing Ideal Time–Frequency Magnitude Masking for Speech Separation , 2018, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[6] Yann Dauphin,et al. Convolutional Sequence to Sequence Learning , 2017, ICML.
[7] Richard Socher,et al. Regularizing and Optimizing LSTM Language Models , 2017, ICLR.
[8] R. Maas,et al. A summary of the REVERB challenge: state-of-the-art and remaining challenges in reverberant speech processing research , 2016, EURASIP Journal on Advances in Signal Processing.
[9] Jae S. Lim,et al. Signal estimation from modified short-time Fourier transform , 1983, ICASSP.
[10] Ming-Wei Chang,et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding , 2019, NAACL.
[11] Surya Ganguli,et al. Exact solutions to the nonlinear dynamics of learning in deep linear neural networks , 2013, ICLR.
[12] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[13] DeLiang Wang,et al. Supervised Speech Separation Based on Deep Learning: An Overview , 2017, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[14] Bo Chen,et al. MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications , 2017, ArXiv.
[15] Antonio Miguel,et al. Deep Speech Enhancement for Reverberated and Noisy Signals using Wide Residual Networks , 2019, ArXiv.
[16] Tao Zhang,et al. Late Reverberation Suppression Using Recurrent Neural Networks with Long Short-Term Memory , 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[17] Yu Tsao,et al. Incorporating Symbolic Sequential Modeling for Speech Enhancement , 2019, INTERSPEECH.
[18] Andries P. Hekstra,et al. Perceptual evaluation of speech quality (PESQ) - a new method for speech quality assessment of telephone networks and codecs , 2001, 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.01CH37221).
[19] Jont B. Allen,et al. Image method for efficiently simulating small‐room acoustics , 1976 .
[20] Vladlen Koltun,et al. An Empirical Evaluation of Generic Convolutional and Recurrent Networks for Sequence Modeling , 2018, ArXiv.
[21] Biing-Hwang Juang,et al. Speech Dereverberation Based on Variance-Normalized Delayed Linear Prediction , 2010, IEEE Transactions on Audio, Speech, and Language Processing.
[22] Tao Shen,et al. DiSAN: Directional Self-Attention Network for RNN/CNN-free Language Understanding , 2017, AAAI.
[23] Tao Zhang,et al. Learning Spectral Mapping for Speech Dereverberation and Denoising , 2015, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[24] I. McCowan,et al. The multi-channel Wall Street Journal audio visual corpus (MC-WSJ-AV): specification and initial experiments , 2005, IEEE Workshop on Automatic Speech Recognition and Understanding, 2005.
[25] J. Foote,et al. WSJCAM0: A British English Speech Corpus for Large Vocabulary Continuous Speech Recognition , 1995 .
[26] Peter Vary,et al. A binaural room impulse response database for the evaluation of dereverberation algorithms , 2009, 2009 16th International Conference on Digital Signal Processing.
[27] Janet M. Baker,et al. The Design for the Wall Street Journal-based CSR Corpus , 1992, HLT.
[28] Wojciech Zaremba,et al. Recurrent Neural Network Regularization , 2014, ArXiv.
[29] Jian Sun,et al. Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).
[30] Yoshua Bengio,et al. Deep Sparse Rectifier Neural Networks , 2011, AISTATS.
[31] François Chollet,et al. Xception: Deep Learning with Depthwise Separable Convolutions , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[32] Lukasz Kaiser,et al. Attention is All you Need , 2017, NIPS.
[33] Vladlen Koltun,et al. Multi-Scale Context Aggregation by Dilated Convolutions , 2015, ICLR.
[34] Steve Renals,et al. WSJCAM0: a British English speech corpus for large vocabulary continuous speech recognition , 1995, 1995 International Conference on Acoustics, Speech, and Signal Processing.
[35] Yi Hu,et al. Objective measures for predicting speech intelligibility in noisy conditions based on new band-importance functions. , 2009, The Journal of the Acoustical Society of America.
[36] Tomohiro Nakatani,et al. Making Machines Understand Us in Reverberant Rooms: Robustness Against Reverberation for Automatic Speech Recognition , 2012, IEEE Signal Process. Mag.
[37] Chin-Hui Lee,et al. A Reverberation-Time-Aware Approach to Speech Dereverberation Based on Deep Neural Networks , 2017, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[38] Gaël Richard,et al. Single channel reverberation suppression based on sparse linear prediction , 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[39] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[40] Haizhou Li,et al. Speech dereverberation for enhancement and recognition using dynamic features constrained deep neural networks and feature adaptation , 2016, EURASIP J. Adv. Signal Process.
[41] Tiago H. Falk,et al. A Non-Intrusive Quality and Intelligibility Measure of Reverberant and Dereverberated Speech , 2010, IEEE Transactions on Audio, Speech, and Language Processing.
[42] Jian Sun,et al. Identity Mappings in Deep Residual Networks , 2016, ECCV.
[43] Quoc V. Le,et al. Listen, attend and spell: A neural network for large vocabulary conversational speech recognition , 2015, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[44] Tiago H. Falk,et al. Speech Dereverberation With Context-Aware Recurrent Neural Networks , 2017, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[45] DeLiang Wang,et al. Learning spectral mapping for speech dereverberation , 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[46] Masakiyo Fujimoto,et al. Linear Prediction-Based Dereverberation with Advanced Speech Enhancement and Recognition Technologies for the REVERB Challenge , 2014 .
[47] Yong Xu,et al. An Attention-based Neural Network Approach for Single Channel Speech Enhancement , 2019, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[48] DeLiang Wang,et al. A two-stage algorithm for one-microphone reverberant speech enhancement , 2006, IEEE Transactions on Audio, Speech, and Language Processing.
[49] Luca Antiga,et al. Automatic differentiation in PyTorch , 2017 .
[50] Garrison W. Cottrell,et al. Understanding Convolution for Semantic Segmentation , 2017, 2018 IEEE Winter Conference on Applications of Computer Vision (WACV).