Exploring neural transducers for end-to-end speech recognition
Hairong Liu | Adam Coates | Eric Battenberg | Sanjeev Satheesh | Zhenyao Zhu | Jitong Chen | Rewon Child | Yashesh Gaur | Yi Li | Anuroop Sriram
[1] Jürgen Schmidhuber,et al. Connectionist temporal classification: labelling unsegmented sequence data with recurrent neural networks , 2006, ICML.
[2] Daniel Povey,et al. The Kaldi Speech Recognition Toolkit , 2011.
[3] Alex Graves,et al. Sequence Transduction with Recurrent Neural Networks , 2012, ArXiv.
[4] Tara N. Sainath,et al. Deep Neural Networks for Acoustic Modeling in Speech Recognition , 2012.
[5] Philipp Koehn,et al. Dirt Cheap Web-Scale Parallel Text from the Common Crawl , 2013, ACL.
[6] Geoffrey E. Hinton,et al. Speech recognition with deep recurrent neural networks , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[7] Navdeep Jaitly,et al. Towards End-To-End Speech Recognition with Recurrent Neural Networks , 2014, ICML.
[8] Daniel Jurafsky,et al. First-Pass Large Vocabulary Continuous Speech Recognition using Bi-Directional Recurrent DNNs , 2014, ArXiv.
[9] Tara N. Sainath,et al. Acoustic modelling with CD-CTC-sMBR LSTM RNNs , 2015, 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU).
[10] Yajie Miao,et al. EESEN: End-to-end speech recognition using deep RNN models and WFST-based decoding , 2015, 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU).
[11] Yoshua Bengio,et al. Attention-Based Models for Speech Recognition , 2015, NIPS.
[12] Quoc V. Le,et al. Listen, Attend and Spell , 2015, ArXiv.
[13] Geoffrey Zweig,et al. Achieving Human Parity in Conversational Speech Recognition , 2016, ArXiv.
[14] Yiming Wang,et al. Purely Sequence-Trained Neural Networks for ASR Based on Lattice-Free MMI , 2016, INTERSPEECH.
[15] Gabriel Synnaeve,et al. Wav2Letter: an End-to-End ConvNet-based Speech Recognition System , 2016, ArXiv.
[16] Yoshua Bengio,et al. End-to-end attention-based large vocabulary speech recognition , 2015, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[17] Chong Wang,et al. Deep Speech 2: End-to-End Speech Recognition in English and Mandarin , 2015, ICML.
[18] Liang Lu,et al. Segmental Recurrent Neural Networks for End-to-End Speech Recognition , 2016, INTERSPEECH.
[19] Vaibhava Goel,et al. Dense Prediction on Sequences with Time-Dilated Convolutions for Speech Recognition , 2016, ArXiv.
[20] Yoav Goldberg,et al. Sequence to Sequence Transduction with Hard Monotonic Attention , 2016, ArXiv.
[21] George Kurian,et al. Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation , 2016, ArXiv.
[22] Yashesh Gaur,et al. Reducing Bias in Production Speech Models , 2017, ArXiv.
[23] Colin Raffel,et al. Online and Linear-Time Attention by Enforcing Monotonic Alignments , 2017, ICML.
[24] Navdeep Jaitly,et al. Towards Better Decoding and Language Model Integration in Sequence to Sequence Models , 2016, INTERSPEECH.
[25] Yoav Goldberg,et al. Morphological Inflection Generation with Hard Monotonic Attention , 2016, ACL.
[26] Navdeep Jaitly,et al. An online sequence-to-sequence model for noisy speech recognition , 2017, ArXiv.
[27] Geoffrey Zweig,et al. Advances in all-neural speech recognition , 2016, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[28] Xiaodong Cui,et al. English Conversational Telephone Speech Recognition by Humans and Machines , 2017, INTERSPEECH.
[29] Xiangang Li,et al. Gram-CTC: Automatic Unit Selection and Target Decomposition for Sequence Labelling , 2017, ICML.