Zhengchen Zhang | Bowen Zhou | Youzheng Wu | Xiaodong He | Xiaoxiao Li | Li Fu | Runyu Wang
[1] Kyle Gorman, et al. Prosodylab-aligner: A tool for forced alignment of laboratory speech, 2011.
[2] Quoc V. Le, et al. Pushing the Limits of Semi-Supervised Learning for Automatic Speech Recognition, 2020, ArXiv.
[3] Kenneth Heafield, et al. KenLM: Faster and Smaller Language Model Queries, 2011, WMT@EMNLP.
[4] Oriol Vinyals, et al. Representation Learning with Contrastive Predictive Coding, 2018, ArXiv.
[5] Geoffrey E. Hinton, et al. A Simple Framework for Contrastive Learning of Visual Representations, 2020, ICML.
[6] Alexei Baevski, et al. wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations, 2020, NeurIPS.
[7] Nut Limsopatham, et al. Using Phoneme Representations to Build Predictive Models Robust to ASR Errors, 2020, SIGIR.
[8] Abdel-rahman Mohamed, et al. Effectiveness of Self-Supervised Pre-Training for ASR, 2020, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[9] Gabriel Synnaeve, et al. Joint Masked CPC and CTC Training for ASR, 2020, ArXiv.
[10] Bowen Zhou, et al. Incremental Learning for End-to-End Automatic Speech Recognition, 2020, 2021 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU).
[11] Hung-yi Lee, et al. Mockingjay: Unsupervised Speech Representation Learning with Deep Bidirectional Transformer Encoders, 2020, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[12] Jürgen Schmidhuber, et al. Connectionist temporal classification: labelling unsegmented sequence data with recurrent neural networks, 2006, ICML.
[13] Chengyi Wang, et al. Semantic Mask for Transformer based End-to-End Speech Recognition, 2020, INTERSPEECH.
[14] Yonatan Belinkov, et al. Analyzing Hidden Representations in End-to-End Automatic Speech Recognition Systems, 2017, NIPS.
[15] Hao Zheng, et al. AISHELL-1: An open-source Mandarin speech corpus and a speech recognition baseline, 2017, 2017 20th Conference of the Oriental Chapter of the International Coordinating Committee on Speech Databases and Speech I/O Systems and Assessment (O-COCOSDA).
[16] Xiangang Li, et al. Improving Transformer-based Speech Recognition Using Unsupervised Pre-training, 2019, ArXiv.
[17] Hao Tang, et al. An Unsupervised Autoregressive Model for Speech Representation Learning, 2019, INTERSPEECH.
[18] Yashesh Gaur, et al. On the Comparison of Popular End-to-End Models for Large Scale Speech Recognition, 2020, INTERSPEECH.
[19] Ce Liu, et al. Supervised Contrastive Learning, 2020, NeurIPS.
[20] Tara N. Sainath, et al. A Comparison of Sequence-to-Sequence Models for Speech Recognition, 2017, INTERSPEECH.
[21] Furu Wei, et al. UniSpeech: Unified Speech Representation Learning with Labeled and Unlabeled Data, 2021, ICML.
[22] Ronan Collobert, et al. wav2vec: Unsupervised Pre-training for Speech Recognition, 2019, INTERSPEECH.
[23] Tao Han, et al. Supervised Contrastive Learning for Accented Speech Recognition, 2021, ArXiv.
[24] Lei Xie, et al. WeNet: Production Oriented Streaming and Non-Streaming End-to-End Speech Recognition Toolkit, 2021, INTERSPEECH.
[25] Alexei Baevski, et al. vq-wav2vec: Self-Supervised Learning of Discrete Speech Representations, 2019, ICLR.
[26] Daniel Povey, et al. The Kaldi Speech Recognition Toolkit, 2011.
[27] Kenneth Ward Church, et al. Decoupling Recognition and Transcription in Mandarin ASR, 2021, 2021 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU).
[28] Kuan-Yu Chen, et al. Non-autoregressive Transformer-based End-to-end ASR using BERT, 2021, ArXiv.