Predicting Multi-Codebook Vector Quantization Indexes for Knowledge Distillation
Daniel Povey | Piotr Żelasko | Liyong Guo | Fan Cui | Quandong Wang | Zengwei Yao | Fangjun Kuang | Xiaoyu Yang | Yuxiang Kong | Mingshuang Luo | Wei Kang | Long Lin