A Comparison of End-to-End Models for Long-Form Speech Recognition
Tara N. Sainath | Arun Narayanan | Yonghui Wu | Zhifeng Chen | Rohit Prabhavalkar | Shuyuan Zhang | Wei Han | Chung-Cheng Chiu | Anjuli Kannan | Ruoming Pang | Patrick Nguyen | Yu Zhang | Hank Liao | Sergey Kishchenko
[1] Tara N. Sainath et al., "State-of-the-Art Speech Recognition with Sequence-to-Sequence Models," in Proc. ICASSP, 2018.
[2] Hagen Soltau et al., "Neural Speech Recognizer: Acoustic-to-Word LSTM Model for Large Vocabulary Speech Recognition," in Proc. INTERSPEECH, 2017.
[3] Tara N. Sainath et al., "Streaming End-to-End Speech Recognition for Mobile Devices," in Proc. ICASSP, 2019.
[4] Tara N. Sainath et al., "Improving the Efficiency of Forward-Backward Algorithm Using Batched Computation in TensorFlow," in Proc. ASRU, 2017.
[5] Tara N. Sainath et al., "Compression of End-to-End Models," in Proc. INTERSPEECH, 2018.
[6] Arun Narayanan et al., "Toward Domain-Invariant Speech Recognition via Large Scale Training," in Proc. SLT, 2018.
[7] Khe Chai Sim et al., "Efficient Implementation of Recurrent Neural Network Transducer in TensorFlow," in Proc. SLT, 2018.
[8] Wei Li et al., "Monotonic Infinite Lookback Attention for Simultaneous Machine Translation," in Proc. ACL, 2019.
[9] Yoshua Bengio et al., "End-to-End Attention-Based Large Vocabulary Speech Recognition," in Proc. ICASSP, 2016.
[10] Navdeep Jaitly et al., "Towards End-to-End Speech Recognition with Recurrent Neural Networks," in Proc. ICML, 2014.
[11] Alex Graves, "Generating Sequences with Recurrent Neural Networks," arXiv preprint, 2013.
[12] Hairong Liu et al., "Exploring Neural Transducers for End-to-End Speech Recognition," in Proc. ASRU, 2017.
[13] Quoc V. Le et al., "Listen, Attend and Spell," arXiv preprint, 2015.
[14] Hank Liao et al., "Large Scale Deep Neural Network Acoustic Modeling with Semi-Supervised Training Data for YouTube Video Transcription," in Proc. ASRU, 2013.
[15] Jürgen Schmidhuber et al., "Connectionist Temporal Classification: Labelling Unsegmented Sequence Data with Recurrent Neural Networks," in Proc. ICML, 2006.
[16] Satoshi Nakamura et al., "Local Monotonic Attention Mechanism for End-to-End Speech and Language Processing," in Proc. IJCNLP, 2017.
[17] Colin Raffel et al., "Online and Linear-Time Attention by Enforcing Monotonic Alignments," in Proc. ICML, 2017.
[18] Tara N. Sainath et al., "Lingvo: A Modular and Scalable Framework for Sequence-to-Sequence Modeling," arXiv preprint, 2019.
[19] Colin Raffel et al., "Monotonic Chunkwise Attention," in Proc. ICLR, 2018.
[20] Yoshua Bengio et al., "Attention-Based Models for Speech Recognition," in Proc. NIPS, 2015.
[21] Alex Graves, "Sequence Transduction with Recurrent Neural Networks," arXiv preprint, 2012.
[22] Geoffrey E. Hinton et al., "Speech Recognition with Deep Recurrent Neural Networks," in Proc. ICASSP, 2013.
[23] Yu Zhang et al., "Very Deep Convolutional Networks for End-to-End Speech Recognition," in Proc. ICASSP, 2017.