Ozlem Kalinli | Duc Le | Chunyang Wu | Jay Mahadeokar | Christian Fuegen | Michael L. Seltzer | Yangyang Shi | Yuan Shangguan | Alex Xiao | Hang Su
[1] Liang Qiao,et al. Optimizing Speech Recognition For The Edge , 2019, ArXiv.
[2] Tara N. Sainath,et al. Towards Fast and Accurate Streaming End-To-End ASR , 2020, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[3] Xiaofei Wang,et al. A Comparative Study on Transformer vs RNN in Speech Applications , 2019, 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU).
[4] Geoffrey Zweig,et al. Transformer-Based Acoustic Modeling for Hybrid Speech Recognition , 2020, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[5] Shuang Xu,et al. Speech-Transformer: A No-Recurrence Sequence-to-Sequence Model for Speech Recognition , 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[6] Geoffrey Zweig,et al. Faster, Simpler and More Accurate Hybrid ASR Systems Using Wordpieces , 2020, INTERSPEECH.
[7] Yongqiang Wang,et al. Weak-Attention Suppression For Transformer Based Speech Recognition , 2020, INTERSPEECH.
[8] Qian Zhang,et al. Transformer Transducer: One Model Unifying Streaming and Non-streaming Speech Recognition , 2020, ArXiv.
[9] Frank Zhang,et al. Transformer in Action: A Comparative Study of Transformer-Based Acoustic Models for Large Scale Speech Recognition Applications , 2020, ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[10] Matthias Sperber,et al. Self-Attentional Acoustic Models , 2018, INTERSPEECH.
[11] Lei Xie,et al. WeNet: Production First and Production Ready End-to-End Speech Recognition Toolkit , 2021, ArXiv.
[12] Jun Zhang,et al. Dynamic latency speech recognition with asynchronous revision , 2020, ArXiv.
[13] Ke Li,et al. A Time-Restricted Self-Attention Layer for ASR , 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[14] Gil Keren,et al. Alignment Restricted Streaming Recurrent Neural Network Transducer , 2021, 2021 IEEE Spoken Language Technology Workshop (SLT).
[15] Daehyun Kim,et al. Attention Based On-Device Streaming Speech Recognition with Large Speech Corpus , 2019, 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU).
[16] Shuang Xu,et al. Multilingual End-to-End Speech Recognition with A Single Transformer on Low-Resource Languages , 2018, ArXiv.
[17] Matt Shannon,et al. Improved End-of-Query Detection for Streaming Speech Recognition , 2017, INTERSPEECH.
[18] Tara N. Sainath,et al. FastEmit: Low-Latency Streaming ASR with Sequence-Level Emission Regularization , 2020, ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[19] Bo Xu,et al. Self-attention Aligner: A Latency-control End-to-end Model for ASR Using Self-attention Network and Chunk-hopping , 2019, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[20] Kjell Schubert,et al. Transformer-Transducer: End-to-End Speech Recognition with Self-Attention , 2019, ArXiv.
[21] Andrew W. Senior,et al. Fast and accurate recurrent neural network acoustic models for speech recognition , 2015, INTERSPEECH.
[22] Tara N. Sainath,et al. Multi-Dialect Speech Recognition with a Single Sequence-to-Sequence Model , 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[23] Shuang Xu,et al. Syllable-Based Sequence-to-Sequence Speech Recognition with the Transformer in Mandarin Chinese , 2018, INTERSPEECH.
[24] Tara N. Sainath,et al. Joint Endpointing and Decoding with End-to-end Models , 2019, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[25] Tara N. Sainath,et al. Streaming End-to-end Speech Recognition for Mobile Devices , 2018, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[26] Tara N. Sainath,et al. Cascaded Encoders for Unifying Streaming and Non-Streaming ASR , 2020, ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[27] Michael L. Seltzer,et al. Memory-Efficient Speech Recognition on Smart Devices , 2021, ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[28] Andreas Stolcke,et al. Efficient minimum word error rate training of RNN-Transducer for end-to-end speech recognition , 2020, INTERSPEECH.
[29] Frank Zhang,et al. Streaming Attention-Based Models with Augmented Memory for End-to-End Speech Recognition , 2020, ArXiv.
[30] Chengyi Wang,et al. Low Latency End-to-End Streaming Speech Recognition with a Scout Network , 2020, INTERSPEECH.
[31] Alex Graves,et al. Sequence Transduction with Recurrent Neural Networks , 2012, ArXiv.
[32] Frank Zhang,et al. Emformer: Efficient Memory Transformer Based Acoustic Model For Low Latency Streaming Speech Recognition , 2020, ArXiv.
[33] Lukasz Kaiser,et al. Attention Is All You Need , 2017, NIPS.
[34] Tara N. Sainath,et al. Emitting Word Timings with End-to-End Models , 2020, INTERSPEECH.
[35] Tara N. Sainath,et al. Dual-mode ASR: Unify and Improve Streaming ASR with Full-context Modeling , 2020, ICLR.
[36] Gabriel Synnaeve,et al. Scaling Up Online Speech Recognition Using ConvNets , 2020, INTERSPEECH.
[37] Tara N. Sainath,et al. A Streaming On-Device End-To-End Model Surpassing Server-Side Conventional Model Quality and Latency , 2020, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[38] Qian Zhang,et al. Transformer Transducer: A Streamable Speech Recognition Model with Transformer Encoders and RNN-T Loss , 2020, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[39] Sanjeev Khudanpur,et al. Audio augmentation for speech recognition , 2015, INTERSPEECH.