State-of-the-Art Speech Recognition with Sequence-to-Sequence Models
暂无分享,去创建一个
Tara N. Sainath | Navdeep Jaitly | Yonghui Wu | Zhifeng Chen | Michiel Bacchiani | Ron J. Weiss | Rohit Prabhavalkar | Jan Chorowski | Bo Li | Chung-Cheng Chiu | Anjuli Kannan | Patrick Nguyen | Kanishka Rao | Katya Gonina | Navdeep Jaitly | P. Nguyen | T. Sainath | Z. Chen | J. Chorowski | Kanishka Rao | Yonghui Wu | C. Chiu | Rohit Prabhavalkar | Anjuli Kannan | Katya Gonina | Bo Li | M. Bacchiani | N. Jaitly | Patrick Nguyen | Chung-Cheng Chiu
[1] Jürgen Schmidhuber,et al. Long Short-Term Memory , 1997, Neural Computation.
[2] Kuldip K. Paliwal,et al. Bidirectional recurrent neural networks , 1997, IEEE Trans. Signal Process..
[3] Daniel Povey,et al. Minimum Phone Error and I-smoothing for improved discriminative training , 2002, 2002 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[4] Brian Kingsbury,et al. Lattice-based optimization of sequence classification criteria for neural-network acoustic modeling , 2009, 2009 IEEE International Conference on Acoustics, Speech and Signal Processing.
[5] Cyril Allauzen,et al. Bayesian Language Model Interpolation for Mobile Speech Input , 2011, INTERSPEECH.
[6] Marc'Aurelio Ranzato,et al. Large Scale Distributed Deep Networks , 2012, NIPS.
[7] Alex Graves,et al. Sequence Transduction with Recurrent Neural Networks , 2012, ArXiv.
[8] Mike Schuster,et al. Japanese and Korean voice search , 2012, 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[9] Quoc V. Le,et al. Sequence to Sequence Learning with Neural Networks , 2014, NIPS.
[10] Andrew W. Senior,et al. Fast and accurate recurrent neural network acoustic models for speech recognition , 2015, INTERSPEECH.
[11] Yoshua Bengio,et al. Attention-Based Models for Speech Recognition , 2015, NIPS.
[12] Quoc V. Le,et al. Listen, Attend and Spell , 2015, ArXiv.
[13] Samy Bengio,et al. Scheduled Sampling for Sequence Prediction with Recurrent Neural Networks , 2015, NIPS.
[14] Yoshua Bengio,et al. Neural Machine Translation by Jointly Learning to Align and Translate , 2014, ICLR.
[15] Sergey Ioffe,et al. Rethinking the Inception Architecture for Computer Vision , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[16] Marc'Aurelio Ranzato,et al. Sequence Level Training with Recurrent Neural Networks , 2015, ICLR.
[17] Yoshua Bengio,et al. End-to-end attention-based large vocabulary speech recognition , 2015, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[18] Martín Abadi,et al. TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems , 2016, ArXiv.
[19] Samy Bengio,et al. An Online Sequence-to-Sequence Model Using Partial Conditioning , 2015, NIPS.
[20] George Kurian,et al. Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation , 2016, ArXiv.
[21] Tara N. Sainath,et al. Lower Frame Rate Neural Network Acoustic Models , 2016, INTERSPEECH.
[22] Kaiming He,et al. Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour , 2017, ArXiv.
[23] Yu Zhang,et al. Latent Sequence Decompositions , 2016, ICLR.
[24] Lukasz Kaiser,et al. Attention is All you Need , 2017, NIPS.
[25] Tara N. Sainath,et al. A Comparison of Sequence-to-Sequence Models for Speech Recognition , 2017, INTERSPEECH.
[26] Colin Raffel,et al. Online and Linear-Time Attention by Enforcing Monotonic Alignments , 2017, ICML.
[27] Matt Shannon,et al. Recurrent Neural Aligner: An Encoder-Decoder Neural Network Model for Sequence to Sequence Mapping , 2017, INTERSPEECH.
[28] Navdeep Jaitly,et al. Towards Better Decoding and Language Model Integration in Sequence to Sequence Models , 2016, INTERSPEECH.
[29] Rohit Prabhavalkar,et al. Exploring architectures, data and units for streaming end-to-end speech recognition with RNN-transducer , 2017, 2017 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU).
[30] Tara N. Sainath,et al. Minimum Word Error Rate Training for Attention-Based Sequence-to-Sequence Models , 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[31] Tara N. Sainath,et al. An Analysis of Incorporating an External Language Model into a Sequence-to-Sequence Model , 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[32] Tara N. Sainath,et al. No Need for a Lexicon? Evaluating the Value of the Pronunciation Lexica in End-to-End Models , 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[33] Tara N. Sainath,et al. Improving the Performance of Online Neural Transducer Models , 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).