Weak-Attention Suppression For Transformer Based Speech Recognition
Yongqiang Wang | Chunyang Wu | Christian Fuegen | Michael L. Seltzer | Frank Zhang | Yangyang Shi | Ching-Feng Yeh | Duc Le