HyperConformer: Multi-head HyperMixer for Efficient Speech Recognition