TrimTail: Low-Latency Streaming ASR with Simple but Effective Spectrogram-Level Length Penalty
Binbin Zhang | Di Wu | Zhiyong Wu | Changbao Zhu | Fuping Pan | Zhendong Peng | Xingcheng Song | Wenpeng Li | Yuekai Zhang
[1] Chao Weng, et al. Bayes risk CTC: Controllable CTC alignment in Sequence-to-Sequence tasks, 2022, ICLR.
[2] Linfu Xie, et al. WeNet 2.0: More Productive End-to-End Speech Recognition Toolkit, 2022, INTERSPEECH.
[3] Xin Lei, et al. U2++: Unified Two-pass Bidirectional End-to-end Model for Speech Recognition, 2021, ArXiv.
[4] Rohit Prabhavalkar, et al. Dissecting User-Perceived Latency of On-Device E2E Speech Recognition, 2021, INTERSPEECH.
[5] Tara N. Sainath, et al. FastEmit: Low-Latency Streaming ASR with Sequence-Level Emission Regularization, 2020, ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[6] Yu Zhang, et al. Conformer: Convolution-augmented Transformer for Speech Recognition, 2020, INTERSPEECH.
[7] Hao Zheng, et al. AISHELL-1: An open-source Mandarin speech corpus and a speech recognition baseline, 2017, 2017 20th Conference of the Oriental Chapter of the International Coordinating Committee on Speech Databases and Speech I/O Systems and Assessment (O-COCOSDA).
[8] Lukasz Kaiser, et al. Attention Is All You Need, 2017, NIPS.
[9] Quoc V. Le, et al. Listen, attend and spell: A neural network for large vocabulary conversational speech recognition, 2015, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[10] Andrew W. Senior, et al. Fast and accurate recurrent neural network acoustic models for speech recognition, 2015, INTERSPEECH.
[11] Sanjeev Khudanpur, et al. Librispeech: An ASR corpus based on public domain audio books, 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[12] Alex Graves, et al. Sequence Transduction with Recurrent Neural Networks, 2012, ArXiv.
[13] Jürgen Schmidhuber, et al. Connectionist temporal classification: labelling unsegmented sequence data with recurrent neural networks, 2006, ICML.