Batch-normalized joint training for DNN-based distant speech recognition
Yoshua Bengio | Maurizio Omologo | Mirco Ravanelli | Philemon Brakel
[1] Richard M. Stern,et al. Likelihood-maximizing beamforming for robust hands-free speech recognition , 2004, IEEE Transactions on Speech and Audio Processing.
[2] Jon Barker,et al. The third ‘CHiME’ speech separation and recognition challenge: Dataset, task and baselines , 2015, 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU).
[3] Andrey Temko,et al. Acoustic Event Detection and Classification , 2007, Computers in the Human Interaction Loop.
[4] Tomohiro Nakatani,et al. The reverb challenge: A common evaluation framework for dereverberation and recognition of reverberant speech , 2013, 2013 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics.
[5] John R. Hershey,et al. Speech enhancement and recognition using multi-task learning of long short-term memory recurrent neural networks , 2015, INTERSPEECH.
[6] Maurizio Omologo,et al. Realistic Multi-Microphone Data Simulation for Distant Speech Recognition , 2016, INTERSPEECH.
[7] Maurizio Omologo,et al. Impulse response estimation for robust speech recognition in a reverberant environment , 2012, 2012 Proceedings of the 20th European Signal Processing Conference (EUSIPCO).
[8] Petros Maragos,et al. The DIRHA simulated corpus , 2014, LREC.
[9] Sergey Ioffe,et al. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift , 2015, ICML.
[10] Thomas Hain,et al. Using neural network front-ends on far field multiple microphones based speech recognition , 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[11] Geoffrey E. Hinton,et al. Deep Learning , 2015, Nature.
[12] Ning Ma,et al. The PASCAL CHiME speech separation and recognition challenge , 2013, Comput. Speech Lang.
[13] Björn W. Schuller,et al. Speech Enhancement with LSTM Recurrent Neural Networks and its Application to Noise-Robust ASR , 2015, LVA/ICA.
[14] Daniel Povey,et al. The Kaldi Speech Recognition Toolkit , 2011 .
[15] Li-Rong Dai,et al. A Regression Approach to Speech Enhancement Based on Deep Neural Networks , 2015, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[16] Roland Maas,et al. Spatial diffuseness features for DNN-based speech recognition in noisy and reverberant environments , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[17] Jun Du,et al. An Experimental Study on Speech Enhancement Based on Deep Neural Networks , 2014, IEEE Signal Processing Letters.
[18] Yoshua Bengio,et al. Knowledge Matters: Importance of Prior Information for Optimization , 2013, J. Mach. Learn. Res.
[19] Tatsuya Kawahara,et al. Reverberant speech recognition combining deep neural networks and deep autoencoders augmented with a phone-class feature , 2015, EURASIP J. Adv. Signal Process.
[20] Nitish Srivastava,et al. Dropout: a simple way to prevent neural networks from overfitting , 2014, J. Mach. Learn. Res.
[21] Gerald Friedland,et al. Audio concept classification with Hierarchical Deep Neural Networks , 2014, 2014 22nd European Signal Processing Conference (EUSIPCO).
[22] Alessio Brutti,et al. A speech event detection and localization task for multiroom environments , 2014, 2014 4th Joint Workshop on Hands-free Speech Communication and Microphone Arrays (HSCMA).
[23] Tatsuya Kawahara,et al. Joint Optimization of Denoising Autoencoder and DNN Acoustic Model Based on Multi-Target Learning for Noisy Speech Recognition , 2016, INTERSPEECH.
[24] Klaus-Robert Müller,et al. Efficient BackProp , 2012, Neural Networks: Tricks of the Trade.
[25] Chengzhu Yu,et al. The NTT CHiME-3 system: Advances in speech enhancement and recognition for mobile multi-microphone devices , 2015, 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU).
[26] Zhong-Qiu Wang,et al. A Joint Training Framework for Robust Automatic Speech Recognition , 2016, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[27] H. Bourlard,et al. Interpretation of Multiparty Meetings: the AMI and AMIDA Projects , 2008, 2008 Hands-Free Speech Communication and Microphone Arrays.
[28] Alex Acero,et al. Joint Discriminative Front End and Back End Training for Improved Speech Recognition Accuracy , 2006, 2006 IEEE International Conference on Acoustics Speech and Signal Processing Proceedings.
[29] Steve Renals,et al. Hybrid acoustic models for distant and multichannel large vocabulary speech recognition , 2013, 2013 IEEE Workshop on Automatic Speech Recognition and Understanding.
[30] Tomohiro Nakatani,et al. Optimization of Speech Enhancement Front-End with Speech Recognition-Level Criterion , 2016, INTERSPEECH.
[31] Ulrike Goldschmidt. Speech And Audio Processing In Adverse Environments , 2016 .
[32] Yoshua Bengio,et al. Understanding the difficulty of training deep feedforward neural networks , 2010, AISTATS.
[33] Arun Ross,et al. Microphone Arrays , 2009, Encyclopedia of Biometrics.
[34] Jun Du,et al. Joint training of front-end and back-end deep neural networks for robust speech recognition , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[35] Aaron C. Courville,et al. Recurrent Batch Normalization , 2016, ICLR.
[36] Te-Won Lee,et al. Blind Speech Separation , 2007, Blind Speech Separation.
[37] Maurizio Omologo. A prototype of distant-talking interface for control of interactive TV , 2010, 2010 Conference Record of the Forty Fourth Asilomar Conference on Signals, Systems and Computers.
[38] Yoshua Bengio,et al. FitNets: Hints for Thin Deep Nets , 2014, ICLR.
[39] Tara N. Sainath,et al. Speaker location and microphone spacing invariant acoustic modeling from raw multichannel waveforms , 2015, 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU).
[40] Maurizio Omologo,et al. Contaminated speech training methods for robust DNN-HMM distant speech recognition , 2017, INTERSPEECH.
[41] John Salvatier,et al. Theano: A Python framework for fast computation of mathematical expressions , 2016, ArXiv.
[42] DeLiang Wang,et al. Joint noise adaptive training for robust automatic speech recognition , 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[43] Maurizio Omologo,et al. The DIRHA-ENGLISH corpus and related tasks for distant-speech recognition in domestic environments , 2015, 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU).
[44] Maurizio Omologo,et al. On the selection of the impulse responses for distant-speech recognition based on contaminated speech training , 2014, INTERSPEECH.
[46] Maurizio Omologo,et al. Acoustic event localization using a crosspower-spectrum phase based technique , 1994, Proceedings of ICASSP '94. IEEE International Conference on Acoustics, Speech and Signal Processing.
[47] Figen Ertaş,et al. Fundamentals of Speaker Recognition , 2011 .
[48] Dong Yu,et al. Automatic Speech Recognition: A Deep Learning Approach , 2014 .
[49] Yuuki Tachioka,et al. The MERL/MELCO/TUM system for the REVERB Challenge using Deep Recurrent Neural Network Feature Enhancement , 2014, ICASSP 2014.
[50] François Laviolette,et al. Domain-Adversarial Training of Neural Networks , 2015, J. Mach. Learn. Res.