暂无分享,去创建一个
Zejun Ma | Shaoshi Ling | Chen Shen | Meng Cai | Shaoshi Ling | Chen Shen | Zejun Ma | Meng Cai
[1] Yuzong Liu,et al. DeCoAR 2.0: Deep Contextualized Acoustic Representations with Vector Quantization , 2020, ArXiv.
[2] Jae-Gil Lee,et al. Learning from Noisy Labels with Deep Neural Networks: A Survey , 2020, ArXiv.
[3] Sanjeev Khudanpur,et al. Semi-Supervised Training of Acoustic Models Using Lattice-Free MMI , 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[4] Yongqiang Wang,et al. Semi-Supervised Training in Deep Learning Acoustic Model , 2016, INTERSPEECH.
[5] Kenneth Ward Church,et al. Deep neural network features and semi-supervised training for low resource speech recognition , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[6] Hermann Ney,et al. RWTH ASR Systems for LibriSpeech: Hybrid vs Attention - w/o Data Augmentation , 2019, INTERSPEECH.
[7] Matthijs Douze,et al. Lead2Gold: Towards Exploiting the Full Potential of Noisy Transcriptions for Speech Recognition , 2019, 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU).
[8] M. Verleysen,et al. Classification in the Presence of Label Noise: A Survey , 2014, IEEE Transactions on Neural Networks and Learning Systems.
[9] Gabriel Synnaeve,et al. Iterative Pseudo-Labeling for Speech Recognition , 2020, INTERSPEECH.
[10] Ming-Wei Chang,et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding , 2019, NAACL.
[11] Majid Mirbagheri,et al. ASR for Under-Resourced Languages From Probabilistic Transcription , 2017, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[12] Gabriel Synnaeve,et al. slimIPL: Language-Model-Free Iterative Pseudo-Labeling , 2020, Interspeech.
[13] Awni Hannun,et al. Self-Training for End-to-End Speech Recognition , 2020, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[14] Ruslan Salakhutdinov,et al. HuBERT: Self-Supervised Speech Representation Learning by Masked Prediction of Hidden Units , 2021, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[15] Hao Tang,et al. An Unsupervised Autoregressive Model for Speech Representation Learning , 2019, INTERSPEECH.
[16] Rico Sennrich,et al. Neural Machine Translation of Rare Words with Subword Units , 2015, ACL.
[17] Shang-Wen Li,et al. TERA: Self-Supervised Learning of Transformer Encoder Representation for Speech , 2020, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[18] Tomoharu Iwata,et al. Semi-Supervised End-to-End Speech Recognition , 2018, INTERSPEECH.
[19] Furu Wei,et al. UniSpeech: Unified Speech Representation Learning with Labeled and Unlabeled Data , 2021, ICML.
[20] Ronan Collobert,et al. wav2vec: Unsupervised Pre-training for Speech Recognition , 2019, INTERSPEECH.
[21] Sanjeev Khudanpur,et al. Librispeech: An ASR corpus based on public domain audio books , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[22] Gabriel Synnaeve,et al. Joint Masked CPC and CTC Training for ASR , 2020, ArXiv.
[23] Furu Wei,et al. UniSpeech at scale: An Empirical Study of Pre-training Method on Large-Scale Speech Recognition Dataset , 2021, 2107.05233.
[24] Quoc V. Le,et al. Listen, attend and spell: A neural network for large vocabulary conversational speech recognition , 2015, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[25] Alexei Baevski,et al. wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations , 2020, NeurIPS.
[26] Sree Hari Krishnan Parthasarathi,et al. Lessons from Building Acoustic Models with a Million Hours of Speech , 2019, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[27] Quoc V. Le,et al. Improved Noisy Student Training for Automatic Speech Recognition , 2020, INTERSPEECH.
[28] Hung-yi Lee,et al. Mockingjay: Unsupervised Speech Representation Learning with Deep Bidirectional Transformer Encoders , 2020, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[29] Quoc V. Le,et al. SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition , 2019, INTERSPEECH.
[30] Yuzong Liu,et al. Deep Contextualized Acoustic Representations for Semi-Supervised Speech Recognition , 2020, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[31] Yuzong Liu,et al. BERTphone: Phonetically-aware Encoder Representations for Utterance-level Speaker and Language Recognition , 2019, Odyssey.
[32] Jürgen Schmidhuber,et al. Connectionist temporal classification: labelling unsegmented sequence data with recurrent neural networks , 2006, ICML.
[33] Yu Zhang,et al. Conformer: Convolution-augmented Transformer for Speech Recognition , 2020, INTERSPEECH.
[34] Daniel Povey,et al. The Kaldi Speech Recognition Toolkit , 2011 .