Masking and Inpainting: A Two-Stage Speech Enhancement Approach for Low SNR and Non-Stationary Noise
暂无分享,去创建一个
Zhiyu Wang | Wei Chen | Xiang Hao | Feilong Bao | Xiangdong Su | Shixue Wen | Yiqian Pan | Zhiyu Wang | F. Bao | Xiangdong Su | Shi-Xue Wen | Xiang Hao | Yiqian Pan | Wei Chen
[1] Jesper Jensen,et al. A short-time objective intelligibility measure for time-frequency weighted noisy speech , 2010, 2010 IEEE International Conference on Acoustics, Speech and Signal Processing.
[2] Li-Rong Dai,et al. A Regression Approach to Speech Enhancement Based on Deep Neural Networks , 2015, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[3] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[4] DeLiang Wang,et al. Supervised Speech Separation Based on Deep Learning: An Overview , 2017, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[5] DeLiang Wang,et al. A Convolutional Recurrent Neural Network for Real-Time Speech Enhancement , 2018, INTERSPEECH.
[6] A. Sreenivasa Murthy,et al. Comparison of Speech Enhancement Algorithms , 2016 .
[7] Sepp Hochreiter,et al. Fast and Accurate Deep Network Learning by Exponential Linear Units (ELUs) , 2015, ICLR.
[8] Andries P. Hekstra,et al. Perceptual evaluation of speech quality (PESQ)-a new method for speech quality assessment of telephone networks and codecs , 2001, 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.01CH37221).
[9] Ting-Chun Wang,et al. Partial Convolution based Padding , 2018, ArXiv.
[10] Li Fei-Fei,et al. Perceptual Losses for Real-Time Style Transfer and Super-Resolution , 2016, ECCV.
[11] DeLiang Wang,et al. On Ideal Binary Mask As the Computational Goal of Auditory Scene Analysis , 2005, Speech Separation by Humans and Machines.
[12] Ting-Chun Wang,et al. Image Inpainting for Irregular Holes Using Partial Convolutions , 2018, ECCV.
[13] Chao Yang,et al. Contextual-Based Image Inpainting: Infer, Match, and Translate , 2017, ECCV.
[14] Bernhard Schölkopf,et al. EnhanceNet: Single Image Super-Resolution Through Automated Texture Synthesis , 2016, 2017 IEEE International Conference on Computer Vision (ICCV).
[15] Simon Dixon,et al. Wave-U-Net: A Multi-Scale Neural Network for End-to-End Audio Source Separation , 2018, ISMIR.
[16] Jonathan G. Fiscus,et al. DARPA TIMIT:: acoustic-phonetic continuous speech corpus CD-ROM, NIST speech disc 1-1.1 , 1993 .
[17] Andrea Vedaldi,et al. Instance Normalization: The Missing Ingredient for Fast Stylization , 2016, ArXiv.