暂无分享,去创建一个
[1] Aren Jansen,et al. Towards Learning a Universal Non-Semantic Representation of Speech , 2020, INTERSPEECH.
[2] Quoc V. Le,et al. SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition , 2019, INTERSPEECH.
[3] Rafael E. Banchs,et al. Automatic Correction of ASR Outputs by Using Machine Translation , 2016, INTERSPEECH.
[4] Shruti Palaskar,et al. ASR Error Correction and Domain Adaptation Using Machine Translation , 2020, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[5] Hermann Ney,et al. Improved backing-off for M-gram language modeling , 1995, 1995 International Conference on Acoustics, Speech, and Signal Processing.
[6] Alex Graves,et al. Sequence Transduction with Recurrent Neural Networks , 2012, ArXiv.
[7] Tara N. Sainath,et al. Streaming End-to-end Speech Recognition for Mobile Devices , 2018, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[8] Tara N. Sainath,et al. A Spelling Correction Model for End-to-end Speech Recognition , 2019, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[9] Hermann Ney,et al. Joint-sequence models for grapheme-to-phoneme conversion , 2008, Speech Commun..
[10] Julius Kunze,et al. Transfer Learning for Speech Recognition on a Budget , 2017, Rep4NLP@ACL.
[11] Shinji Watanabe,et al. Multilingual Sequence-to-Sequence Speech Recognition: Architecture, Transfer Learning, and Language Modeling , 2018, 2018 IEEE Spoken Language Technology Workshop (SLT).
[12] Tara N. Sainath,et al. Lower Frame Rate Neural Network Acoustic Models , 2016, INTERSPEECH.
[13] Tara N. Sainath,et al. Large-Scale Multilingual Speech Recognition with a Streaming End-to-End Model , 2019, INTERSPEECH.
[14] Srikanth Ronanki,et al. In Other News: a Bi-style Text-to-speech Model for Synthesizing Newscaster Voice with Limited Data , 2019, NAACL.
[15] Daniel Willett,et al. Using Synthetic Audio to Improve The Recognition of Out-Of-Vocabulary Words in End-To-End ASR Systems , 2020, ArXiv.
[16] Bhuvana Ramabhadran,et al. Speech Recognition with Augmented Synthesized Speech , 2019, 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU).
[17] Yoshua Bengio,et al. End-to-end attention-based large vocabulary speech recognition , 2015, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[18] Hermann Ney,et al. Generating Synthetic Audio Data for Attention-Based Speech Recognition Systems , 2020, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[19] Vikas Joshi,et al. Transfer Learning Approaches for Streaming End-to-End Speech Recognition System , 2020, INTERSPEECH.
[20] Josef R. Novak,et al. Phonetisaurus: Exploring grapheme-to-phoneme conversion with joint n-gram models in the WFST framework , 2015, Natural Language Engineering.
[21] Gabriel Synnaeve,et al. Massively Multilingual ASR: 50 Languages, 1 Model, 1 Billion Parameters , 2020, INTERSPEECH.
[22] Grzegorz Kondrak,et al. Applying Many-to-Many Alignments and Hidden Markov Models to Letter-to-Phoneme Conversion , 2007, NAACL.
[23] Tatsuya Kawahara,et al. Leveraging Sequence-to-Sequence Speech Synthesis for Enhancing Acoustic-to-Word Speech Recognition , 2018, 2018 IEEE Spoken Language Technology Workshop (SLT).
[24] Taku Kudo,et al. SentencePiece: A simple and language independent subword tokenizer and detokenizer for Neural Text Processing , 2018, EMNLP.
[25] Quoc V. Le,et al. Listen, attend and spell: A neural network for large vocabulary conversational speech recognition , 2015, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).