A Comparative Study of Time and Frequency Domain Approaches to Deep Learning based Speech Enhancement
Cornelius Glackin | Julie Wall | Nigel Cannings | Soha A. Nossier | Mansour Moniri
[1] Onur Avci,et al. 1-D Convolutional Neural Networks for Signal Processing Applications , 2019, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[2] Simon King,et al. The voice bank corpus: Design, collection and data analysis of a large regional accent speech database , 2013, 2013 International Conference Oriental COCOSDA held jointly with 2013 Conference on Asian Spoken Language Research and Evaluation (O-COCOSDA/CASLRE).
[3] Jesper Jensen,et al. An Algorithm for Intelligibility Prediction of Time–Frequency Weighted Noisy Speech , 2011, IEEE Transactions on Audio, Speech, and Language Processing.
[4] Mark D. Plumbley,et al. Single channel audio source separation using convolutional denoising autoencoders , 2017, 2017 IEEE Global Conference on Signal and Information Processing (GlobalSIP).
[5] Tara N. Sainath,et al. Deep Learning for Audio Signal Processing , 2019, IEEE Journal of Selected Topics in Signal Processing.
[6] Sanjeev Khudanpur,et al. Librispeech: An ASR corpus based on public domain audio books , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[7] Jiri Malek,et al. Single channel speech enhancement using convolutional neural network , 2017, 2017 IEEE International Workshop of Electronics, Control, Measurement, Signals and their Application to Mechatronics (ECMSM).
[8] Jinwon Lee,et al. A Fully Convolutional Neural Network for Speech Enhancement , 2016, INTERSPEECH.
[9] Yu Tsao,et al. Speech enhancement based on deep denoising autoencoder , 2013, INTERSPEECH.
[10] Yu Tsao,et al. SNR-Aware Convolutional Neural Network Modeling for Speech Enhancement , 2016, INTERSPEECH.
[11] Yu Zhang,et al. Very deep convolutional networks for end-to-end speech recognition , 2016, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[12] Yan Tang,et al. A perceptually-weighted deep neural network for monaural speech enhancement in various background noise conditions , 2017, 2017 25th European Signal Processing Conference (EUSIPCO).
[13] DeLiang Wang,et al. Complex Ratio Masking for Monaural Speech Separation , 2016, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[14] Li-Rong Dai,et al. A Regression Approach to Speech Enhancement Based on Deep Neural Networks , 2015, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[15] Jonathan Le Roux,et al. Phase Processing for Single-Channel Speech Enhancement: History and recent advances , 2015, IEEE Signal Processing Magazine.
[16] Yu Tsao,et al. End-to-End Waveform Utterance Enhancement for Direct Evaluation Metrics Optimization by Fully Convolutional Neural Networks , 2017, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[17] DeLiang Wang,et al. On Training Targets for Supervised Speech Separation , 2014, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[18] Jinyu Li,et al. Feature Learning in Deep Neural Networks - Studies on Speech Recognition Tasks , 2013, ICLR 2013.
[19] John Cristian Borges Gamboa,et al. Deep Learning for Time-Series Analysis , 2017, ArXiv.
[20] Deividas Eringis,et al. Improving Speech Recognition Rate through Analysis Parameters , 2014.
[21] Antonio Bonafonte,et al. SEGAN: Speech Enhancement Generative Adversarial Network , 2017, INTERSPEECH.
[22] Shadi Pirhosseinloo,et al. A new feature set for masking-based monaural speech separation , 2018, 2018 52nd Asilomar Conference on Signals, Systems, and Computers.
[23] F. Harris. On the use of windows for harmonic analysis with the discrete Fourier transform , 1978, Proceedings of the IEEE.
[24] Wei Dai,et al. Very deep convolutional neural networks for raw waveforms , 2016, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[25] Prajoy Podder,et al. Comparative Performance Analysis of Hamming, Hanning and Blackman Window , 2014.
[26] Jae S. Lim,et al. The unimportance of phase in speech enhancement , 1982.
[27] Jesper Jensen,et al. On Loss Functions for Supervised Monaural Time-Domain Speech Enhancement , 2020, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[28] Onur Avci,et al. 1D Convolutional Neural Networks and Applications: A Survey , 2019, Mechanical Systems and Signal Processing.
[30] Yu Tsao,et al. Complex spectrogram enhancement by convolutional neural network with multi-metrics learning , 2017, 2017 IEEE 27th International Workshop on Machine Learning for Signal Processing (MLSP).
[31] Panayiotis G. Georgiou,et al. Perception Optimized Deep Denoising AutoEncoders for Speech Enhancement , 2016, INTERSPEECH.
[32] Philipos C. Loizou,et al. Speech Enhancement: Theory and Practice , 2007.
[33] Benoit Champagne,et al. A Fully Convolutional Neural Network for Complex Spectrogram Processing in Speech Enhancement , 2019, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[34] Ming Tu,et al. Speech enhancement based on Deep Neural Networks with skip connections , 2017, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[35] Emanuel A. P. Habets,et al. Time-Frequency Masking Based Online Speech Enhancement with Multi-Channel Data Using Convolutional Neural Networks , 2018, 2018 16th International Workshop on Acoustic Signal Enhancement (IWAENC).
[36] DeLiang Wang,et al. A Convolutional Recurrent Neural Network for Real-Time Speech Enhancement , 2018, INTERSPEECH.
[37] DeLiang Wang,et al. Binary and ratio time-frequency masks for robust speech recognition , 2006, Speech Commun.
[38] DeLiang Wang,et al. A deep neural network for time-domain signal reconstruction , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[39] DeLiang Wang,et al. Deep learning reinvents the hearing aid , 2017, IEEE Spectrum.
[40] Pascal Vincent,et al. Stacked Denoising Autoencoders: Learning Useful Representations in a Deep Network with a Local Denoising Criterion , 2010, J. Mach. Learn. Res.
[41] Guigang Zhang,et al. Deep Learning , 2016, Int. J. Semantic Comput.
[42] DeLiang Wang,et al. Supervised Speech Separation Based on Deep Learning: An Overview , 2017, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[43] Yi Hu,et al. Subjective comparison and evaluation of speech enhancement algorithms , 2007, Speech Commun.
[44] Kuldip K. Paliwal,et al. The importance of phase in speech enhancement , 2011, Speech Commun.
[45] Herman J. M. Steeneken,et al. Assessment for automatic speech recognition: II. NOISEX-92: A database and an experiment to study the effect of additive noise on speech recognition systems , 1993, Speech Commun.
[46] Huy Phan,et al. Comparing time and frequency domain for audio event recognition using deep learning , 2016, 2016 International Joint Conference on Neural Networks (IJCNN).
[47] DeLiang Wang,et al. A New Framework for CNN-Based Speech Enhancement in the Time Domain , 2019, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[48] Parham Aarabi,et al. On the importance of phase in human speech recognition , 2006, IEEE Transactions on Audio, Speech, and Language Processing.
[49] Jürgen Schmidhuber,et al. Deep learning in neural networks: An overview , 2014, Neural Networks.
[50] Yu Tsao,et al. Raw waveform-based speech enhancement by fully convolutional networks , 2017, 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC).
[51] M. Portnoff. Time-frequency representation of digital signals and systems based on short-time Fourier analysis , 1980.