Whisper-KDQ: A Lightweight Whisper via Guided Knowledge Distillation and Quantization for Efficient ASR