Bi-Encoder Transformer Network for Mandarin-English Code-Switching Speech Recognition Using Mixture of Experts
暂无分享,去创建一个
Hao Li | Yanmin Qian | Yizhou Lu | Mingkun Huang | Jiaqi Guo | Y. Qian | Hao Li | Yizhou Lu | Mingkun Huang | Jiaqi Guo
[1] Lori Lamel,et al. Addressing Code-Switching in French/Algerian Arabic Speech , 2017, INTERSPEECH.
[2] Yifan Gong,et al. Universal Acoustic Modeling Using Neural Mixture Models , 2019, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[3] Tara N. Sainath,et al. State-of-the-Art Speech Recognition with Sequence-to-Sequence Models , 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[4] Tara N. Sainath,et al. Bytes Are All You Need: End-to-end Multilingual Speech Recognition and Synthesis with Bytes , 2018, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[5] Jonathan Le Roux,et al. MIMO-Speech: End-to-End Multi-Channel Multi-Speaker Speech Recognition , 2019, 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU).
[6] Srinivasan Umesh,et al. Investigation of Methods to Improve the Recognition Performance of Tamil-English Code-Switched Data in Transformer Framework , 2020, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[7] Shinji Watanabe,et al. ESPnet: End-to-End Speech Processing Toolkit , 2018, INTERSPEECH.
[8] Lukasz Kaiser,et al. Attention is All you Need , 2017, NIPS.
[9] Tara N. Sainath,et al. Deep Neural Networks for Acoustic Modeling in Speech Recognition: The Shared Views of Four Research Groups , 2012, IEEE Signal Processing Magazine.
[10] Yoshua Bengio,et al. Attention-Based Models for Speech Recognition , 2015, NIPS.
[11] Sanjeev Khudanpur,et al. Librispeech: An ASR corpus based on public domain audio books , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[12] Haizhou Li,et al. On the End-to-End Solution to Mandarin-English Code-switching Speech Recognition , 2018, INTERSPEECH.
[13] Yu Zhang,et al. Advances in Joint CTC-Attention Based End-to-End Speech Recognition with a Deep CNN Encoder and RNN-LM , 2017, INTERSPEECH.
[14] Quoc V. Le,et al. Listen, attend and spell: A neural network for large vocabulary conversational speech recognition , 2015, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[15] Suyoun Kim,et al. Towards Language-Universal End-to-End Speech Recognition , 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[16] Jürgen Schmidhuber,et al. Connectionist temporal classification: labelling unsegmented sequence data with recurrent neural networks , 2006, ICML.
[17] Taku Kudo,et al. Subword Regularization: Improving Neural Network Translation Models with Multiple Subword Candidates , 2018, ACL.
[18] Kai Yu,et al. Cluster adaptive training for deep neural network , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[19] Yifan Gong,et al. Towards Code-switching ASR for End-to-end CTC Models , 2019, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[20] Shakti P. Rath,et al. A Multi-Accent Acoustic Model Using Mixture of Experts for Speech Recognition , 2019, INTERSPEECH.
[21] Dong Yu,et al. Investigating End-to-end Speech Recognition for Mandarin-english Code-switching , 2019, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[22] Shinji Watanabe,et al. Joint CTC-attention based end-to-end speech recognition using multi-task learning , 2016, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[23] Tara N. Sainath,et al. Multilingual Speech Recognition with a Single End-to-End Model , 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[24] Quoc V. Le,et al. SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition , 2019, INTERSPEECH.
[25] Haizhou Li,et al. A first speech recognition system for Mandarin-English code-switch conversational speech , 2012, 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[26] Shinji Watanabe,et al. Improving Transformer-Based End-to-End Speech Recognition with Connectionist Temporal Classification and Language Model Integration , 2019, INTERSPEECH.
[27] Tara N. Sainath,et al. Large-Scale Multilingual Speech Recognition with a Streaming End-to-End Model , 2019, INTERSPEECH.
[28] Shuai Zhang,et al. Rnn-transducer With Language Bias For End-to-end Mandarin-English Code-switching Speech Recognition , 2020, 2021 12th International Symposium on Chinese Spoken Language Processing (ISCSLP).
[29] Thomas Niesler,et al. Building a Unified Code-Switching ASR System for South African Languages , 2018, INTERSPEECH.
[30] Alex Graves,et al. Sequence Transduction with Recurrent Neural Networks , 2012, ArXiv.
[31] Geoffrey E. Hinton,et al. Speech recognition with deep recurrent neural networks , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[32] Xiaofei Wang,et al. A Comparative Study on Transformer vs RNN in Speech Applications , 2019, 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU).
[33] Shankar Kumar,et al. RADMM: Recurrent Adaptive Mixture Model with Applications to Domain Robust Language Modeling , 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).