暂无分享,去创建一个
Khe Chai Sim | Zhouyuan Huo | Ananya Misra | Trevor Strohman | Franccoise Beaufays | David Qiu | Shefali Garg | Yanzhang He | Nikhil Siddhartha | Dongseong Hwang | Zhouyuan Huo | F. Beaufays | K. Sim | Trevor Strohman | Ananya Misra | Yanzhang He | Nikhil Siddhartha | Shefali Garg | David Qiu | DongSeon Hwang
[1] Sanjeev Khudanpur,et al. Librispeech: An ASR corpus based on public domain audio books , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[2] Arun Narayanan,et al. Toward Domain-Invariant Speech Recognition via Large Scale Training , 2018, 2018 IEEE Spoken Language Technology Workshop (SLT).
[3] Arun Narayanan,et al. A Comparison of Supervised and Unsupervised Pre-Training of End-to-End Models , 2021, Interspeech.
[4] Quoc V. Le,et al. Pushing the Limits of Semi-Supervised Learning for Automatic Speech Recognition , 2020, ArXiv.
[5] Yu Zhang,et al. Conformer: Convolution-augmented Transformer for Speech Recognition , 2020, INTERSPEECH.
[6] Yu-An Chung,et al. Generative Pre-Training for Speech with Autoregressive Predictive Coding , 2020, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[7] Oriol Vinyals,et al. Representation Learning with Contrastive Predictive Coding , 2018, ArXiv.
[8] Tara N. Sainath,et al. Recognizing Long-Form Speech Using Streaming End-to-End Models , 2019, 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU).
[9] Quoc V. Le,et al. Self-Training With Noisy Student Improves ImageNet Classification , 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[10] Khe Chai Sim,et al. Incremental Layer-wise Self-Supervised Learning for Efficient Speech Domain Adaptation On Device , 2021, ArXiv.
[11] Yoshua Bengio,et al. Multi-Task Self-Supervised Learning for Robust Speech Recognition , 2020, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[12] Alex Graves,et al. Sequence Transduction with Recurrent Neural Networks , 2012, ArXiv.
[13] Tara N. Sainath,et al. Learning Word-Level Confidence for Subword End-To-End ASR , 2021, ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[14] Ming-Wei Chang,et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding , 2019, NAACL.
[15] Christian Fuegen,et al. Contrastive Semi-Supervised Learning for ASR , 2021, ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[16] Mike Schuster,et al. Japanese and Korean voice search , 2012, 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[17] Alexei Baevski,et al. wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations , 2020, NeurIPS.
[18] Quoc V. Le,et al. Improved Noisy Student Training for Automatic Speech Recognition , 2020, INTERSPEECH.
[19] Tara N. Sainath,et al. A Better and Faster end-to-end Model for Streaming ASR , 2021, ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[20] Ronan Collobert,et al. wav2vec: Unsupervised Pre-training for Speech Recognition , 2019, INTERSPEECH.
[21] Ilya Sutskever,et al. Language Models are Unsupervised Multitask Learners , 2019 .
[22] Armand Joulin,et al. Libri-Light: A Benchmark for ASR with Limited or No Supervision , 2020, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[23] Hank Liao,et al. Large scale deep neural network acoustic modeling with semi-supervised training data for YouTube video transcription , 2013, 2013 IEEE Workshop on Automatic Speech Recognition and Understanding.