Speech Pre-training with Acoustic Piece
Furu Wei | Shuo Ren | Yu Wu | Shujie Liu | Long Zhou
[1] Jinyu Li, et al. WavLM: Large-Scale Self-Supervised Pre-Training for Full Stack Speech Processing, 2021, IEEE Journal of Selected Topics in Signal Processing.
[2] Li Dong, et al. XLM-E: Cross-lingual Language Model Pre-training via ELECTRA, 2021, ACL.
[3] Ruslan Salakhutdinov, et al. HuBERT: Self-Supervised Speech Representation Learning by Masked Prediction of Hidden Units, 2021, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[4] Xiangang Li, et al. GigaSpeech: An Evolving, Multi-domain ASR Corpus with 10,000 Hours of Transcribed Audio, 2021, Interspeech.
[5] Emmanuel Dupoux, et al. VoxPopuli: A Large-Scale Multilingual Speech Corpus for Representation Learning, Semi-Supervised Learning and Interpretation, 2021, ACL.
[6] Yuzong Liu, et al. DeCoAR 2.0: Deep Contextualized Acoustic Representations with Vector Quantization, 2020, arXiv.
[7] Abdel-rahman Mohamed, et al. wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations, 2020, NeurIPS.
[8] Xiangang Li, et al. A Further Study of Unsupervised Pretraining for Transformer Based Speech Recognition, 2020, ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[9] Abdel-rahman Mohamed, et al. Effectiveness of Self-Supervised Pre-Training for ASR, 2020, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[10] Abdel-rahman Mohamed, et al. Libri-Light: A Benchmark for ASR with Limited or No Supervision, 2019, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[11] James R. Glass, et al. Generative Pre-Training for Speech with Autoregressive Predictive Coding, 2019, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[12] Edouard Grave, et al. Reducing Transformer Depth on Demand with Structured Dropout, 2019, ICLR.
[13] Quoc V. Le, et al. SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition, 2019, Interspeech.
[14] Taku Kudo, et al. SentencePiece: A simple and language independent subword tokenizer and detokenizer for Neural Text Processing, 2018, EMNLP.
[15] Kilian Q. Weinberger, et al. Deep Networks with Stochastic Depth, 2016, ECCV.
[16] Sanjeev Khudanpur, et al. Librispeech: An ASR corpus based on public domain audio books, 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[17] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[18] Jürgen Schmidhuber, et al. Connectionist temporal classification: labelling unsegmented sequence data with recurrent neural networks, 2006, ICML.
[19] ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2021.