Know Your Enemy, Know Yourself: A Unified Two-Stage Framework for Speech Enhancement
暂无分享,去创建一个
Andong Li | Yuxuan Ke | Xiaodong Li | Wenzhe Liu | Chengshi Zheng | C. Zheng | Xiaodong Li | Andong Li | Wenzhe Liu | Yuxuan Ke
[1] DeLiang Wang,et al. A Convolutional Recurrent Neural Network for Real-Time Speech Enhancement , 2018, INTERSPEECH.
[2] Jungwon Lee,et al. T-GSA: Transformer with Gaussian-Weighted Self-Attention for Speech Enhancement , 2020, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[3] Jon Barker,et al. The third ‘CHiME’ speech separation and recognition challenge: Dataset, task and baselines , 2015, 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU).
[4] Jason Weston,et al. Curriculum learning , 2009, ICML '09.
[5] Herman J. M. Steeneken,et al. Assessment for automatic speech recognition: II. NOISEX-92: A database and an experiment to study the effect of additive noise on speech recognition systems , 1993, Speech Commun..
[6] David Malah,et al. Speech enhancement using a minimum mean-square error log-spectral amplitude estimator , 1984, IEEE Trans. Acoust. Speech Signal Process..
[7] Richard C. Hendriks,et al. Unbiased MMSE-Based Noise Power Estimation With Low Complexity and Low Tracking Delay , 2012, IEEE Transactions on Audio, Speech, and Language Processing.
[8] Philipos C. Loizou,et al. Speech Enhancement: Theory and Practice , 2007 .
[9] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[10] Nima Mesgarani,et al. Conv-TasNet: Surpassing Ideal Time–Frequency Magnitude Masking for Speech Separation , 2018, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[11] Israel Cohen,et al. Noise spectrum estimation in adverse environments: improved minima controlled recursive averaging , 2003, IEEE Trans. Speech Audio Process..
[12] Kilian Q. Weinberger,et al. CondenseNet: An Efficient DenseNet Using Learned Group Convolutions , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[13] Rainer Martin,et al. Noise power spectral density estimation based on optimal smoothing and minimum statistics , 2001, IEEE Trans. Speech Audio Process..
[14] Chengshi Zheng,et al. ICASSP 2021 Deep Noise Suppression Challenge: Decoupling Magnitude and Phase Optimization with a Two-Stage Deep Network , 2021, ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[15] Li-Rong Dai,et al. A Regression Approach to Speech Enhancement Based on Deep Neural Networks , 2015, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[16] DeLiang Wang,et al. Long short-term memory for speaker generalization in supervised speech separation. , 2017, The Journal of the Acoustical Society of America.
[17] Sebastian Braun,et al. ICASSP 2021 Deep Noise Suppression Challenge , 2020 .
[18] Wouter Tirry,et al. Separated Noise Suppression and Speech Restoration: Lstm-Based Speech Enhancement in Two Stages , 2019, 2019 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA).
[19] Xiaodong Xie,et al. FFA-Net: Feature Fusion Attention Network for Single Image Dehazing , 2019, AAAI.
[20] Biing-Hwang Juang,et al. Speech Dereverberation Based on Variance-Normalized Delayed Linear Prediction , 2010, IEEE Transactions on Audio, Speech, and Language Processing.
[21] Jont B. Allen,et al. Image method for efficiently simulating small‐room acoustics , 1976 .
[22] Jian Li,et al. A Constrained MMSE LP Residual Estimator for Speech Dereverberation in Noisy Environments , 2014, IEEE Signal Processing Letters.
[23] Ephraim. Speech enhancement using a minimum mean square error short-time spectral amplitude estimator , 1984 .
[24] Philipos C. Loizou,et al. A multi-band spectral subtraction method for enhancing speech corrupted by colored noise , 2002, 2002 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[25] Sandhya Hawaldar,et al. Speech Enhancement for Nonstationary Noise Environments , 2011 .
[26] DeLiang Wang,et al. Learning Complex Spectral Mapping With Gated Convolutional Recurrent Networks for Monaural Speech Enhancement , 2020, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[27] Andries P. Hekstra,et al. Perceptual evaluation of speech quality (PESQ)-a new method for speech quality assessment of telephone networks and codecs , 2001, 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.01CH37221).
[28] Yu Tsao,et al. Complex spectrogram enhancement by convolutional neural network with multi-metrics learning , 2017, 2017 IEEE 27th International Workshop on Machine Learning for Signal Processing (MLSP).
[29] Antonio Bonafonte,et al. SEGAN: Speech Enhancement Generative Adversarial Network , 2017, INTERSPEECH.
[30] Chengshi Zheng,et al. Two Heads are Better Than One: A Two-Stage Complex Spectral Mapping Approach for Monaural Speech Enhancement , 2021, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[31] Jesper Jensen,et al. An Algorithm for Predicting the Intelligibility of Speech Masked by Modulated Noise Maskers , 2016, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[32] Janet M. Baker,et al. The Design for the Wall Street Journal-based CSR Corpus , 1992, HLT.
[33] Rundi Wu,et al. Listening to Sounds of Silence for Speech Denoising , 2020, NeurIPS.
[34] Yann Dauphin,et al. Language Modeling with Gated Convolutional Networks , 2016, ICML.
[35] DeLiang Wang,et al. Supervised Speech Separation Based on Deep Learning: An Overview , 2017, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[36] Thomas Brox,et al. U-Net: Convolutional Networks for Biomedical Image Segmentation , 2015, MICCAI.