Improving Tail Performance of a Deliberation E2E ASR Model Using a Large Text Corpus
Tara N. Sainath | Ruoming Pang | James Apfel | Cal Peyser | Shankar Kumar | Sepand Mavandadi
[1] Tara N. Sainath, et al. Two-Pass End-to-End Speech Recognition, 2019, INTERSPEECH.
[2] Tara N. Sainath, et al. Improving Proper Noun Recognition in End-To-End ASR by Customization of the MWER Loss Criterion, 2020, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[3] Chengzhu Yu, et al. Minimum Bayes Risk Training of RNN-Transducer for End-to-End Speech Recognition, 2019, INTERSPEECH.
[4] Navdeep Jaitly, et al. Towards Better Decoding and Language Model Integration in Sequence to Sequence Models, 2016, INTERSPEECH.
[5] Thorsten Brants, et al. One billion word benchmark for measuring progress in statistical language modeling, 2013, INTERSPEECH.
[6] Alex Graves, et al. Sequence Transduction with Recurrent Neural Networks, 2012, ArXiv.
[7] Yonghui Wu, et al. Exploring the Limits of Language Modeling, 2016, ArXiv.
[8] Adam Coates, et al. Cold Fusion: Training Seq2Seq Models Together with Language Models, 2017, INTERSPEECH.
[9] Steve Renals, et al. Hierarchical Bayesian Language Models for Conversational Speech Recognition, 2010, IEEE Transactions on Audio, Speech, and Language Processing.
[10] Daniel Povey, et al. The Kaldi Speech Recognition Toolkit, 2011.
[11] Mitch Weintraub, et al. Learning name pronunciations in automatic speech recognition systems, 2003, Proceedings. 15th IEEE International Conference on Tools with Artificial Intelligence.
[12] Tomi Kinnunen, et al. INTERSPEECH 2013 14th Annual Conference of the International Speech Communication Association, 2013, Interspeech 2015.
[13] Hairong Liu, et al. Exploring neural transducers for end-to-end speech recognition, 2017, 2017 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU).
[14] Tara N. Sainath, et al. An Analysis of Incorporating an External Language Model into a Sequence-to-Sequence Model, 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[15] Tara N. Sainath, et al. Minimum Word Error Rate Training for Attention-Based Sequence-to-Sequence Models, 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[16] Shaohe Lv, et al. An Overview of End-to-End Automatic Speech Recognition, 2019, Symmetry.
[17] Arun Narayanan, et al. Toward Domain-Invariant Speech Recognition via Large Scale Training, 2018, 2018 IEEE Spoken Language Technology Workshop (SLT).
[18] Fadi Biadsy, et al. Effectively Building Tera Scale MaxEnt Language Models Incorporating Non-Linguistic Signals, 2017, INTERSPEECH.
[19] Erik McDermott, et al. A Deep Generative Acoustic Model for Compositional Automatic Speech Recognition, 2018.
[20] Tara N. Sainath, et al. Streaming End-to-end Speech Recognition for Mobile Devices, 2018, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[21] Alexander Gutkin, et al. Recent Advances in Google Real-Time HMM-Driven Unit Selection Synthesizer, 2016, INTERSPEECH.
[22] Yoshua Bengio, et al. On Using Monolingual Corpora in Neural Machine Translation, 2015, ArXiv.
[23] Tara N. Sainath, et al. A Comparison of Techniques for Language Model Integration in Encoder-Decoder Speech Recognition, 2018, 2018 IEEE Spoken Language Technology Workshop (SLT).
[24] Matt Shannon, et al. Optimizing Expected Word Error Rate via Sampling for Speech Recognition, 2017, INTERSPEECH.
[25] Cyril Allauzen, et al. Language model verbalization for automatic speech recognition, 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[26] Ehsan Variani, et al. A Density Ratio Approach to Language Model Fusion in End-to-End Automatic Speech Recognition, 2019, 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU).
[27] Paul Deléglise, et al. Acoustics-based phonetic transcription method for proper nouns, 2010, INTERSPEECH.
[28] Ilya Sutskever, et al. Language Models are Unsupervised Multitask Learners, 2019.
[29] Tara N. Sainath, et al. Deliberation Model Based Two-Pass End-To-End Speech Recognition, 2020, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).