The LeVoice Far-Field Speech Recognition System for VOiCES from a Distance Challenge 2019
Junjie Wang | Chen Jia | Lin Yang | Yingjie Li | Yulong Liang | Xuyang Wang
[1] Richard M. Stern, et al. Robust Speech Recognition Based on Binaural Auditory Processing, 2017, INTERSPEECH.
[2] Ke Li, et al. A Time-Restricted Self-Attention Layer for ASR, 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[3] Steve Renals, et al. Convolutional Neural Networks for Distant Speech Recognition, 2014, IEEE Signal Processing Letters.
[4] Daniel Povey, et al. The Kaldi Speech Recognition Toolkit, 2011.
[5] Colleen Richey, et al. The VOiCES from a Distance Challenge 2019 Evaluation Plan, 2019, ArXiv.
[6] Sanjeev Khudanpur, et al. A study on data augmentation of reverberant speech for robust speech recognition, 2017, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[7] Israel Cohen, et al. Noise spectrum estimation in adverse environments: improved minima controlled recursive averaging, 2003, IEEE Trans. Speech Audio Process.
[8] Yiming Wang, et al. Semi-Orthogonal Low-Rank Matrix Factorization for Deep Neural Networks, 2018, INTERSPEECH.
[9] Tara N. Sainath, et al. Deep Neural Networks for Acoustic Modeling in Speech Recognition: The Shared Views of Four Research Groups, 2012, IEEE Signal Processing Magazine.
[10] Dmitry Popov, et al. An Investigation of Mixup Training Strategies for Acoustic Models in ASR, 2018, INTERSPEECH.
[11] Xiaohui Zhang, et al. Backstitch: Counteracting Finite-Sample Bias via Negative Steps, 2017, INTERSPEECH.
[12] Dong Yu, et al. Conversational Speech Transcription Using Context-Dependent Deep Neural Networks, 2012, ICML.
[13] Colleen Richey, et al. Voices Obscured in Complex Environmental Settings (VOICES) corpus, 2018, INTERSPEECH.
[14] Yonghong Yan, et al. Output-Gate Projected Gated Recurrent Unit for Speech Recognition, 2018, INTERSPEECH.
[15] Yuuki Tachioka, et al. Deep recurrent de-noising auto-encoder and blind de-reverberation for reverberated speech recognition, 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[16] Dong Yu, et al. Context-Dependent Pre-Trained Deep Neural Networks for Large-Vocabulary Speech Recognition, 2012, IEEE Transactions on Audio, Speech, and Language Processing.
[17] Jonathan G. Fiscus, et al. A post-processing system to yield reduced word error rates: Recognizer Output Voting Error Reduction (ROVER), 1997, 1997 IEEE Workshop on Automatic Speech Recognition and Understanding Proceedings.
[18] Richard M. Stern, et al. Robust speech recognition using temporal masking and thresholding algorithm, 2014, INTERSPEECH.
[19] Thomas Hain, et al. Recognition and understanding of meetings the AMI and AMIDA projects, 2007, 2007 IEEE Workshop on Automatic Speech Recognition & Understanding (ASRU).
[20] Xiaofei Wang, et al. The Hitachi/JHU CHiME-5 system: Advances in speech recognition for everyday home environments using multiple microphone arrays, 2018.
[21] Tomohiro Nakatani, et al. Generalization of Multi-Channel Linear Prediction Methods for Blind MIMO Impulse Response Shortening, 2012, IEEE Transactions on Audio, Speech, and Language Processing.
[22] George Saon, et al. Speaker adaptation of neural network acoustic models using i-vectors, 2013, 2013 IEEE Workshop on Automatic Speech Recognition and Understanding.