Semantic Data Augmentation for End-to-End Mandarin Speech Recognition
Xiangang Li | Zhiyuan Tang | Shuaijiang Zhao | Wei Wang | Wei Zou | Jianwei Sun | Hengxin Yin | Xi Zhao | Xiaoning Lei
[1] Kai Yu, et al. Speaker Augmentation for Low Resource Speech Recognition, 2020, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[2] Bhuvana Ramabhadran, et al. Speech Recognition with Augmented Synthesized Speech, 2019, 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU).
[3] Yoshua Bengio, et al. Attention-Based Models for Speech Recognition, 2015, NIPS.
[4] Yoshua Bengio, et al. End-to-end attention-based large vocabulary speech recognition, 2015, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[5] Sanjeev Khudanpur, et al. Audio augmentation for speech recognition, 2015, INTERSPEECH.
[6] Xiaofei Wang, et al. A Comparative Study on Transformer vs RNN in Speech Applications, 2019, 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU).
[7] Sunil Kumar Kopparapu, et al. Technique for automatic sentence level alignment of long speech and transcripts, 2013, INTERSPEECH.
[8] Hao Li, et al. Data Augmentation for end-to-end Code-Switching Speech Recognition, 2020, 2021 IEEE Spoken Language Technology Workshop (SLT).
[9] Mihai Surdeanu, et al. The Stanford CoreNLP Natural Language Processing Toolkit, 2014, ACL.
[10] John R. Kender, et al. Alignment of Speech to Highly Imperfect Text Transcriptions, 2007, 2007 IEEE International Conference on Multimedia and Expo.
[11] Yu Zhang, et al. Conformer: Convolution-augmented Transformer for Speech Recognition, 2020, INTERSPEECH.
[12] Claudia Ross, et al. Modern Mandarin Chinese Grammar: A Practical Guide, 2006.
[13] Yongqiang Wang, et al. An investigation of deep neural networks for noise robust speech recognition, 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[14] Linda G. Shapiro, et al. ESPNet: Efficient Spatial Pyramid of Dilated Convolutions for Semantic Segmentation, 2018, ECCV.
[15] Erich Elsen, et al. Deep Speech: Scaling up end-to-end speech recognition, 2014, ArXiv.
[16] Dong Yu, et al. Context-Dependent Pre-Trained Deep Neural Networks for Large-Vocabulary Speech Recognition, 2012, IEEE Transactions on Audio, Speech, and Language Processing.
[17] Hermann Ney, et al. Generating Synthetic Audio Data for Attention-Based Speech Recognition Systems, 2020, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[18] Shuang Xu, et al. Speech-Transformer: A No-Recurrence Sequence-to-Sequence Model for Speech Recognition, 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[19] Hulya Yalcin, et al. Improving Low Resource Turkish Speech Recognition with Data Augmentation and TTS, 2019, 2019 16th International Multi-Conference on Systems, Signals & Devices (SSD).
[20] Navdeep Jaitly, et al. Towards End-To-End Speech Recognition with Recurrent Neural Networks, 2014, ICML.
[21] Tara N. Sainath, et al. Generation of Large-Scale Simulated Utterances in Virtual Rooms to Train Deep-Neural Networks for Far-Field Speech Recognition in Google Home, 2017, INTERSPEECH.
[22] Shinji Watanabe, et al. Joint CTC-attention based end-to-end speech recognition using multi-task learning, 2016, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[23] Navdeep Jaitly, et al. Hybrid speech recognition with Deep Bidirectional LSTM, 2013, 2013 IEEE Workshop on Automatic Speech Recognition and Understanding.
[24] Pedro J. Moreno, et al. A recursive algorithm for the forced alignment of very long audio segments, 1998, ICSLP.
[25] Beat Pfister, et al. Text-to-speech alignment of long recordings using universal phone models, 2013, INTERSPEECH.
[26] Quoc V. Le, et al. SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition, 2019, INTERSPEECH.
[27] Wang Baohua (王葆华). A New Attempt in the Study of Chinese Grammaticalization: A Review of The Establishment of Modern Chinese Grammar: The Formation of the Resultative Construction and Its Effects, 2003.
[28] Navdeep Jaitly, et al. Vocal Tract Length Perturbation (VTLP) improves speech recognition, 2013.
[29] Quoc V. Le, et al. Listen, Attend and Spell, 2015, ArXiv.