Online LSTM-based Iterative Mask Estimation for Multi-Channel Speech Enhancement and ASR
暂无分享,去创建一个
Jun Du | Chin-Hui Lee | Nan Zhou | Yanhui Tu | Chin-Hui Lee | Jun Du | Yanhui Tu | N. Zhou
[1] Tara N. Sainath,et al. Deep Neural Networks for Acoustic Modeling in Speech Recognition , 2012 .
[2] Jon Barker,et al. The third ‘CHiME’ speech separation and recognition challenge: Dataset, task and baselines , 2015, 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU).
[3] Hermann Ney,et al. Improved backing-off for M-gram language modeling , 1995, 1995 International Conference on Acoustics, Speech, and Signal Processing.
[4] Emmanuel Vincent,et al. Multichannel Audio Source Separation With Deep Neural Networks , 2016, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[5] Reinhold Häb-Umbach,et al. BLSTM supported GEV beamformer front-end for the 3RD CHiME challenge , 2015, 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU).
[6] Tomohiro Nakatani,et al. Online MVDR Beamformer Based on Complex Gaussian Mixture Model With Spatial Prior for Noise Robust ASR , 2017, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[7] Björn W. Schuller,et al. Discriminatively trained recurrent neural networks for single-channel speech separation , 2014, 2014 IEEE Global Conference on Signal and Information Processing (GlobalSIP).
[8] Jun Du,et al. Speech Separation based on signal-noise-dependent deep neural networks for robust speech recognition , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[9] Tara N. Sainath,et al. Multichannel Signal Processing With Deep Neural Networks for Automatic Speech Recognition , 2017, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[10] Rémi Gribonval,et al. Oracle estimators for the benchmarking of source separation algorithms , 2007, Signal Process..
[11] Jon Barker,et al. An analysis of environment, microphone and data simulation mismatches in robust speech recognition , 2017, Comput. Speech Lang..
[12] Takuya Yoshioka,et al. Robust MVDR beamforming using time-frequency masks for online/offline ASR in noise , 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[13] Walter Kellermann,et al. A generalization of blind source separation algorithms for convolutive mixtures based on second-order statistics , 2005, IEEE Transactions on Speech and Audio Processing.
[14] Mark J. F. Gales,et al. Maximum likelihood linear transformations for HMM-based speech recognition , 1998, Comput. Speech Lang..
[15] Jun Du,et al. LSTM-based iterative mask estimation and post-processing for multi-channel speech enhancement , 2017, 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC).
[16] Jun Du,et al. Joint training of front-end and back-end deep neural networks for robust speech recognition , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[17] Jun Du,et al. On Design of Robust Deep Models for CHiME-4 Multi-Channel Speech Recognition with Multiple Configurations of Array Microphones , 2017, INTERSPEECH.
[18] Reinhold Häb-Umbach,et al. Speech Enhancement With a GSC-Like Structure Employing Eigenvector-Based Transfer Function Ratios Estimation , 2011, IEEE Transactions on Audio, Speech, and Language Processing.
[19] Marc Moonen,et al. Performance Analysis of Multichannel Wiener Filter-Based Noise Reduction in Hearing Aids Under Second Order Statistics Estimation Errors , 2011, IEEE Transactions on Audio, Speech, and Language Processing.
[20] Geoffrey E. Hinton,et al. Acoustic Modeling Using Deep Belief Networks , 2012, IEEE Transactions on Audio, Speech, and Language Processing.
[21] Björn W. Schuller,et al. Speech Enhancement with LSTM Recurrent Neural Networks and its Application to Noise-Robust ASR , 2015, LVA/ICA.
[22] B.D. Van Veen,et al. Beamforming: a versatile approach to spatial filtering , 1988, IEEE ASSP Magazine.
[23] Jun Du,et al. Joint training of DNNs by incorporating an explicit dereverberation structure for distant speech recognition , 2016, EURASIP J. Adv. Signal Process..
[24] Lukás Burget,et al. Recurrent neural network based language model , 2010, INTERSPEECH.
[25] John R. Hershey,et al. Unified Architecture for Multichannel End-to-End Speech Recognition With Neural Beamforming , 2017, IEEE Journal of Selected Topics in Signal Processing.
[26] Wojciech Zaremba,et al. Recurrent Neural Network Regularization , 2014, ArXiv.