Improving RNN Transducer Modeling for End-to-End Speech Recognition
暂无分享,去创建一个
[1] Yu Zhang,et al. Highway long short-term memory RNNS for distant speech recognition , 2015, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[2] Geoffrey E. Hinton,et al. Rectified Linear Units Improve Restricted Boltzmann Machines , 2010, ICML.
[3] Chong Wang,et al. Deep Speech 2 : End-to-End Speech Recognition in English and Mandarin , 2015, ICML.
[4] Liang Lu,et al. Exploring Layer Trajectory LSTM with Depth Processing Units and Attention , 2018, 2018 IEEE Spoken Language Technology Workshop (SLT).
[5] Yifan Gong,et al. Advancing Acoustic-to-Word CTC Model With Attention and Mixed-Units , 2018, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[6] Tara N. Sainath,et al. A Comparison of Sequence-to-Sequence Models for Speech Recognition , 2017, INTERSPEECH.
[7] Tara N. Sainath,et al. Modeling Time-Frequency Patterns with LSTM vs. Convolutional Architectures for LVCSR Tasks , 2016, INTERSPEECH.
[8] Khe Chai Sim,et al. Efficient Implementation of Recurrent Neural Network Transducer in Tensorflow , 2018, 2018 IEEE Spoken Language Technology Workshop (SLT).
[9] Tara N. Sainath,et al. Improving the Performance of Online Neural Transducer Models , 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[10] Hairong Liu,et al. Exploring neural transducers for end-to-end speech recognition , 2017, 2017 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU).
[11] Liang Lu,et al. Improving Layer Trajectory LSTM with Future Context Frames , 2019, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[12] Tara N. Sainath,et al. Highway-LSTM and Recurrent Highway Networks for Speech Recognition , 2017, INTERSPEECH.
[13] Johan Schalkwyk,et al. Learning acoustic frame labeling for speech recognition with recurrent neural networks , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[14] Yu Zhang,et al. A prioritized grid long short-term memory RNN for speech recognition , 2016, 2016 IEEE Spoken Language Technology Workshop (SLT).
[15] Alex Graves,et al. Sequence Transduction with Recurrent Neural Networks , 2012, ArXiv.
[16] Yoshua Bengio,et al. Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling , 2014, ArXiv.
[17] Matt Shannon,et al. Recurrent Neural Aligner: An Encoder-Decoder Neural Network Model for Sequence to Sequence Mapping , 2017, INTERSPEECH.
[18] Jürgen Schmidhuber,et al. Connectionist temporal classification: labelling unsegmented sequence data with recurrent neural networks , 2006, ICML.
[19] Yifan Gong,et al. Advancing Connectionist Temporal Classification with Attention Modeling , 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[20] Yoshua Bengio,et al. Attention-Based Models for Speech Recognition , 2015, NIPS.
[21] Shuang Xu,et al. Multidimensional Residual Learning Based on Recurrent Neural Networks for Acoustic Modeling , 2016, INTERSPEECH.
[22] Jonathan Le Roux,et al. Triggered Attention for End-to-end Speech Recognition , 2019, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[23] Hagen Soltau,et al. Neural Speech Recognizer: Acoustic-to-Word LSTM Model for Large Vocabulary Speech Recognition , 2016, INTERSPEECH.
[24] Geoffrey Zweig,et al. Exploring multidimensional lstms for large vocabulary ASR , 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[25] Yifan Gong,et al. Layer Trajectory LSTM , 2018, INTERSPEECH.
[26] Yoshua Bengio,et al. Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation , 2014, EMNLP.
[27] Yongqiang Wang,et al. Simplifying long short-term memory acoustic models for fast training and decoding , 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[28] Quoc V. Le,et al. Listen, attend and spell: A neural network for large vocabulary conversational speech recognition , 2015, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[29] Tara N. Sainath,et al. Convolutional, Long Short-Term Memory, fully connected Deep Neural Networks , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[30] Geoffrey Zweig,et al. LSTM time and frequency recurrence for automatic speech recognition , 2015, 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU).
[31] Geoffrey E. Hinton,et al. Layer Normalization , 2016, ArXiv.
[32] Colin Raffel,et al. Monotonic Chunkwise Attention , 2017, ICLR.
[33] Tara N. Sainath,et al. State-of-the-Art Speech Recognition with Sequence-to-Sequence Models , 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[34] Yifan Gong,et al. Advancing Acoustic-to-Word CTC Model , 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[35] Rico Sennrich,et al. Neural Machine Translation of Rare Words with Subword Units , 2015, ACL.
[36] Yajie Miao,et al. EESEN: End-to-end speech recognition using deep RNN models and WFST-based decoding , 2015, 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU).
[37] Navdeep Jaitly,et al. Towards End-To-End Speech Recognition with Recurrent Neural Networks , 2014, ICML.
[38] Jungwon Lee,et al. Residual LSTM: Design of a Deep Recurrent Architecture for Distant Speech Recognition , 2017, INTERSPEECH.
[39] Yoshua Bengio,et al. End-to-end attention-based large vocabulary speech recognition , 2015, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[40] Rohit Prabhavalkar,et al. Exploring architectures, data and units for streaming end-to-end speech recognition with RNN-transducer , 2017, 2017 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU).
[41] Yoshua Bengio,et al. Neural Machine Translation by Jointly Learning to Align and Translate , 2014, ICLR.