Self-Supervised End-to-End ASR for Low Resource L2 Swedish
Mikko Kurimo | Aku Rouhe | Raili Hildén | Ragheb Al-Ghezi | Yaroslav Getman
[1] Alexei Baevski,et al. wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations , 2020, NeurIPS.
[2] Tara N. Sainath,et al. State-of-the-Art Speech Recognition with Sequence-to-Sequence Models , 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[3] Gabriel Synnaeve,et al. MLS: A Large-Scale Multilingual Dataset for Speech Research , 2020, INTERSPEECH.
[4] Alexei Baevski,et al. Effectiveness of self-supervised pre-training for speech recognition , 2019, ArXiv.
[5] Andreas Stolcke,et al. The Microsoft 2017 Conversational Speech Recognition System , 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[6] Quoc V. Le,et al. SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition , 2019, INTERSPEECH.
[7] Brian Kingsbury,et al. Multilingual representations for low resource speech recognition and keyword search , 2015, 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU).
[8] Jeffrey Dean,et al. Efficient Estimation of Word Representations in Vector Space , 2013, ICLR.
[9] Martti Vainio,et al. Developing a high-stake digital spoken language proficiency assessment: Results from pilot tests , 2016 .
[10] Mark J. F. Gales,et al. Data augmentation for low resource languages , 2014, INTERSPEECH.
[11] Seongjin Park,et al. A comparison between native and non-native speech for automatic speech recognition , 2019, The Journal of the Acoustical Society of America.
[12] Daniel Povey,et al. The Kaldi Speech Recognition Toolkit , 2011 .
[13] Xiaodong Cui,et al. Data Augmentation for Deep Neural Network Acoustic Modeling , 2015, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[14] Francis M. Tyers,et al. Common Voice: A Massively-Multilingual Speech Corpus , 2020, LREC.
[15] Tatsuya Kawahara,et al. Cross-Lingual Transfer Learning of Non-Native Acoustic Modeling for Pronunciation Error Detection and Diagnosis , 2020, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[16] Stefan Schaden. Generating Non-Native Pronunciation Lexicons by Phonological Rule , 2003 .
[17] Hung-yi Lee,et al. Towards Unsupervised Speech Recognition and Synthesis with Quantized Speech Representation Learning , 2020, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[18] Yuan Gao,et al. Spoken English Intelligibility Remediation with Pocketsphinx Alignment and Feature Extraction Improves Substantially Over the State of the Art , 2017, 2018 2nd IEEE Advanced Information Management,Communicates,Electronic and Automation Control Conference (IMCEC).
[19] George Saon,et al. The IBM 2015 English conversational telephone speech recognition system , 2015, INTERSPEECH.
[20] John P. McCrae,et al. A Survey of Current Datasets for Code-Switching Research , 2020, 2020 6th International Conference on Advanced Computing and Communication Systems (ICACCS).
[21] Emmanuel Dupoux,et al. VoxPopuli: A Large-Scale Multilingual Speech Corpus for Representation Learning, Semi-Supervised Learning and Interpretation , 2021, ACL.
[22] Diego Giuliani,et al. Non-Native Children Speech Recognition Through Transfer Learning , 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[23] Ronan Collobert,et al. wav2vec: Unsupervised Pre-training for Speech Recognition , 2019, INTERSPEECH.
[24] Ronan Collobert,et al. Unsupervised Cross-lingual Representation Learning for Speech Recognition , 2020, Interspeech.
[25] Avni Rajpal,et al. Pseudo Likelihood Correction Technique for Low Resource Accented ASR , 2020, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[26] Geoffrey Zweig,et al. Training ASR Models By Generation of Contextual Information , 2020, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[27] Ming-Wei Chang,et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding , 2019, NAACL.
[28] Alexei Baevski,et al. vq-wav2vec: Self-Supervised Learning of Discrete Speech Representations , 2019, ICLR.
[29] Mark J. F. Gales,et al. Speech recognition and keyword spotting for low-resource languages: Babel project research at CUED , 2014, SLTU.