Neural Transducer Training: Reduced Memory Consumption with Sample-wise Computation
暂无分享,去创建一个
[1] Daniel Povey,et al. Pruned RNN-T for fast, memory-efficient ASR training , 2022, INTERSPEECH.
[2] Edward Z. Yang,et al. Torchaudio: Building Blocks for Audio and Speech Processing , 2021, ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[3] Peter C. Ma,et al. Ten Lessons From Three Generations Shaped Google’s TPUv4i : Industrial Product , 2021, 2021 ACM/IEEE 48th Annual International Symposium on Computer Architecture (ISCA).
[4] Geoffrey Zweig,et al. Benchmarking LF-MMI, CTC And RNN-T Criteria For Streaming ASR , 2020, 2021 IEEE Spoken Language Technology Workshop (SLT).
[5] Qian Zhang,et al. Transformer Transducer: A Streamable Speech Recognition Model with Transformer Encoders and RNN-T Loss , 2020, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[6] Yifan Gong,et al. Improving RNN Transducer Modeling for End-to-End Speech Recognition , 2019, 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU).
[7] Boris Ginsburg,et al. NeMo: a toolkit for building AI applications using Neural Modules , 2019, ArXiv.
[8] Tara N. Sainath,et al. Streaming End-to-end Speech Recognition for Mobile Devices , 2018, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[9] Khe Chai Sim,et al. Efficient Implementation of Recurrent Neural Network Transducer in Tensorflow , 2018, 2018 IEEE Spoken Language Technology Workshop (SLT).
[10] Shinji Watanabe,et al. ESPnet: End-to-End Speech Processing Toolkit , 2018, INTERSPEECH.
[11] Rohit Prabhavalkar,et al. Exploring architectures, data and units for streaming end-to-end speech recognition with RNN-transducer , 2017, 2017 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU).
[12] Hairong Liu,et al. Exploring neural transducers for end-to-end speech recognition , 2017, 2017 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU).
[13] Quoc V. Le,et al. Listen, attend and spell: A neural network for large vocabulary conversational speech recognition , 2015, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[14] Yoshua Bengio,et al. Attention-Based Models for Speech Recognition , 2015, NIPS.
[15] Alex Graves,et al. Sequence Transduction with Recurrent Neural Networks , 2012, ArXiv.
[16] Jürgen Schmidhuber,et al. Connectionist temporal classification: labelling unsegmented sequence data with recurrent neural networks , 2006, ICML.