Speech Enhancement Using End-to-End Speech Recognition Objectives
暂无分享,去创建一个
Shinji Watanabe | Toru Taniguchi | Murali Karthick Baskar | Dung T. Tran | Xiaofei Wang | Aswin Shanmugam Subramanian | Dung Tran | Yuya Fujita | Shinji Watanabe | Xiaofei Wang | A. Subramanian | Yuya Fujita | T. Taniguchi | M. Baskar
[1] Jesper Jensen,et al. Monaural Speech Enhancement Using Deep Neural Networks by Maximizing a Short-Time Objective Intelligibility Measure , 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[2] Jon Barker,et al. The fifth 'CHiME' Speech Separation and Recognition Challenge: Dataset, task and baselines , 2018, INTERSPEECH.
[3] Biing-Hwang Juang,et al. Speech Dereverberation Based on Variance-Normalized Delayed Linear Prediction , 2010, IEEE Transactions on Audio, Speech, and Language Processing.
[4] Xiaofei Wang,et al. An Investigation of End-to-End Multichannel Speech Recognition for Reverberant and Mismatch Conditions , 2019 .
[5] Reinhold Häb-Umbach,et al. Beamnet: End-to-end training of a beamformer-supported multi-channel ASR system , 2017, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[6] Richard M. Stern,et al. Microphone array processing for robust speech recognition , 2003 .
[7] Shigeru Katagiri,et al. Does speech enhancement work with end-to-end ASR objectives?: Experimental analysis of multichannel end-to-end ASR , 2017, 2017 IEEE 27th International Workshop on Machine Learning for Signal Processing (MLSP).
[8] Jacob Benesty,et al. On Optimal Frequency-Domain Multichannel Linear Filtering for Noise Reduction , 2010, IEEE Transactions on Audio, Speech, and Language Processing.
[9] Shinji Watanabe,et al. End-to-end Speech Recognition With Word-Based Rnn Language Models , 2018, 2018 IEEE Spoken Language Technology Workshop (SLT).
[10] Tomohiro Nakatani,et al. Neural Network-Based Spectrum Estimation for Online WPE Dereverberation , 2017, INTERSPEECH.
[11] Jonathan Le Roux,et al. Phase-sensitive and recognition-boosted speech separation using deep recurrent neural networks , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[12] Erin M Picou,et al. The Effects of Noise and Reverberation on Listening Effort in Adults With Normal Hearing , 2016, Ear and hearing.
[13] Paris Smaragdis,et al. Experiments on deep learning for speech denoising , 2014, INTERSPEECH.
[14] Jesper Jensen,et al. An Algorithm for Intelligibility Prediction of Time–Frequency Weighted Noisy Speech , 2011, IEEE Transactions on Audio, Speech, and Language Processing.
[15] Yu Tsao,et al. Speech enhancement based on deep denoising autoencoder , 2013, INTERSPEECH.
[16] John R. Hershey,et al. Multichannel End-to-end Speech Recognition , 2017, ICML.
[17] Tomohiro Nakatani,et al. Generalization of Multi-Channel Linear Prediction Methods for Blind MIMO Impulse Response Shortening , 2012, IEEE Transactions on Audio, Speech, and Language Processing.
[18] Yishay Mansour,et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation , 1999, NIPS.
[19] Shinji Watanabe,et al. ESPnet: End-to-End Speech Processing Toolkit , 2018, INTERSPEECH.
[20] Emmanuel Vincent,et al. A Consolidated Perspective on Multimicrophone Speech Enhancement and Source Separation , 2017, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[21] Jun Du,et al. Joint training of front-end and back-end deep neural networks for robust speech recognition , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[22] R. Maas,et al. A summary of the REVERB challenge: state-of-the-art and remaining challenges in reverberant speech processing research , 2016, EURASIP Journal on Advances in Signal Processing.
[23] Xavier Anguera Miró,et al. Acoustic Beamforming for Speaker Diarization of Meetings , 2007, IEEE Transactions on Audio, Speech, and Language Processing.
[24] DeLiang Wang,et al. On Training Targets for Supervised Speech Separation , 2014, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[25] Reinhold Haeb-Umbach,et al. NARA-WPE: A Python package for weighted prediction error dereverberation in Numpy and Tensorflow for online and offline processing , 2018, ITG Symposium on Speech Communication.
[26] DeLiang Wang,et al. Joint noise adaptive training for robust automatic speech recognition , 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[27] Maurizio Omologo,et al. The DIRHA-ENGLISH corpus and related tasks for distant-speech recognition in domestic environments , 2015, 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU).
[28] Yi Hu,et al. Evaluation of Objective Quality Measures for Speech Enhancement , 2008, IEEE Transactions on Audio, Speech, and Language Processing.
[29] Li-Rong Dai,et al. A Regression Approach to Speech Enhancement Based on Deep Neural Networks , 2015, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[30] Yong Xu,et al. Joint Training of Complex Ratio Mask Based Beamformer and Acoustic Model for Noise Robust Asr , 2019, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).