A Dual-Staged Context Aggregation Method towards Efficient End-to-End Speech Enhancement
暂无分享,去创建一个
[1] Nima Mesgarani,et al. Conv-TasNet: Surpassing Ideal Time–Frequency Magnitude Masking for Speech Separation , 2018, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[2] Andries P. Hekstra,et al. Perceptual evaluation of speech quality (PESQ)-a new method for speech quality assessment of telephone networks and codecs , 2001, 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.01CH37221).
[3] Jesper Jensen,et al. An Algorithm for Intelligibility Prediction of Time–Frequency Weighted Noisy Speech , 2011, IEEE Transactions on Audio, Speech, and Language Processing.
[4] DeLiang Wang,et al. On Ideal Binary Mask As the Computational Goal of Auditory Scene Analysis , 2005, Speech Separation by Humans and Machines.
[5] S. Boll,et al. Suppression of acoustic noise in speech using spectral subtraction , 1979 .
[6] DeLiang Wang,et al. Ideal ratio mask estimation using deep neural networks for robust speech recognition , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[7] Shengwu Xiong,et al. Densely Connected Network with Time-frequency Dilated Convolution for Speech Enhancement , 2019, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[8] Minje Kim,et al. Cascaded Cross-Module Residual Learning towards Lightweight End-to-End Speech Coding , 2019, INTERSPEECH.
[9] DeLiang Wang,et al. Complex Ratio Masking for Monaural Speech Separation , 2016, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[10] DeLiang Wang,et al. Long short-term memory for speaker generalization in supervised speech separation. , 2017, The Journal of the Acoustical Society of America.
[11] Yoshua Bengio,et al. Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling , 2014, ArXiv.
[12] Rémi Gribonval,et al. Performance measurement in blind audio source separation , 2006, IEEE Transactions on Audio, Speech, and Language Processing.
[13] Vladlen Koltun,et al. Multi-Scale Context Aggregation by Dilated Convolutions , 2015, ICLR.
[15] Jonathan G. Fiscus,et al. DARPA TIMIT:: acoustic-phonetic continuous speech corpus CD-ROM, NIST speech disc 1-1.1 , 1993 .
[16] Mikkel N. Schmidt,et al. Single-channel speech separation using sparse non-negative matrix factorization , 2006, INTERSPEECH.
[17] DeLiang Wang,et al. A Convolutional Recurrent Neural Network for Real-Time Speech Enhancement , 2018, INTERSPEECH.
[18] Kilian Q. Weinberger,et al. Densely Connected Convolutional Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[19] Antonio Bonafonte,et al. SEGAN: Speech Enhancement Generative Adversarial Network , 2017, INTERSPEECH.
[20] Li-Rong Dai,et al. A Regression Approach to Speech Enhancement Based on Deep Neural Networks , 2015, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[21] Simon Dixon,et al. Wave-U-Net: A Multi-Scale Neural Network for End-to-End Audio Source Separation , 2018, ISMIR.
[22] Heiga Zen,et al. WaveNet: A Generative Model for Raw Audio , 2016, SSW.
[23] Yu Tsao,et al. Incorporating Symbolic Sequential Modeling for Speech Enhancement , 2019, INTERSPEECH.
[24] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[25] Yu Tsao,et al. End-to-End Waveform Utterance Enhancement for Direct Evaluation Metrics Optimization by Fully Convolutional Neural Networks , 2017, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[26] Guanrong Chen,et al. Introduction to random signals and applied Kalman filtering, 2nd edn. Robert Grover Brown and Patrick Y. C. Hwang, Wiley, New York, 1992. ISBN 0‐471‐52573‐1, 512 pp., $62.95. , 1992 .
[27] Sunwoo Kim,et al. Incremental Binarization on Recurrent Neural Networks for Single-channel Source Separation , 2019, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[28] Naoya Takahashi,et al. Mmdenselstm: An Efficient Combination of Convolutional and Recurrent Neural Networks for Audio Source Separation , 2018, 2018 16th International Workshop on Acoustic Signal Enhancement (IWAENC).
[29] DeLiang Wang,et al. Gated Residual Networks with Dilated Convolutions for Supervised Speech Separation , 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[30] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).