暂无分享,去创建一个
Geoffrey Zweig | Satoshi Nakamura | Xiaohui Zhang | Yongqiang Wang | Gabriel Synnaeve | Andros Tjandra | Chunxi Liu | Frank Zhang
[1] Sanjeev Khudanpur,et al. Librispeech: An ASR corpus based on public domain audio books , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[2] Kaisheng Yao,et al. KL-divergence regularized deep neural network adaptation for improved large vocabulary speech recognition , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[3] Taku Kudo,et al. Subword Regularization: Improving Neural Network Translation Models with Multiple Subword Candidates , 2018, ACL.
[4] Yiming Yang,et al. Transformer-XL: Attentive Language Models beyond a Fixed-Length Context , 2019, ACL.
[5] Myle Ott,et al. fairseq: A Fast, Extensible Toolkit for Sequence Modeling , 2019, NAACL.
[6] Andrew Zisserman,et al. Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.
[7] Quoc V. Le,et al. SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition , 2019, INTERSPEECH.
[8] Noah Constant,et al. Character-Level Language Modeling with Deeper Self-Attention , 2018, AAAI.
[9] Dustin Tran,et al. Image Transformer , 2018, ICML.
[10] Shuang Xu,et al. Speech-Transformer: A No-Recurrence Sequence-to-Sequence Model for Speech Recognition , 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[11] Geoffrey Zweig,et al. From Senones to Chenones: Tied Context-Dependent Graphemes for Hybrid Speech Recognition , 2019, 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU).
[12] Hermann Ney,et al. RWTH ASR Systems for LibriSpeech: Hybrid vs Attention - w/o Data Augmentation , 2019, INTERSPEECH.
[13] Lukasz Kaiser,et al. Attention is All you Need , 2017, NIPS.
[14] Kaisheng Yao,et al. Adaptation of context-dependent deep neural networks for automatic speech recognition , 2012, 2012 IEEE Spoken Language Technology Workshop (SLT).
[15] Ronan Collobert,et al. Sequence-to-Sequence Speech Recognition with Time-Depth Separable Convolutions , 2019, INTERSPEECH.
[16] Daniel Povey,et al. The Kaldi Speech Recognition Toolkit , 2011 .
[17] Jürgen Schmidhuber,et al. Long Short-Term Memory , 1997, Neural Computation.
[18] Luke S. Zettlemoyer,et al. Transformers with convolutional context for ASR , 2019, ArXiv.
[19] Dumitru Erhan,et al. Going deeper with convolutions , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[20] Nicolas Usunier,et al. Fully Convolutional Speech Recognition , 2018, ArXiv.
[21] Tomohiro Nakatani,et al. Context adaptive deep neural networks for fast acoustic model adaptation , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[22] Matthias Sperber,et al. Self-Attentional Acoustic Models , 2018, INTERSPEECH.
[23] Jürgen Schmidhuber,et al. Connectionist temporal classification: labelling unsegmented sequence data with recurrent neural networks , 2006, ICML.
[24] Rohit Prabhavalkar,et al. Exploring architectures, data and units for streaming end-to-end speech recognition with RNN-transducer , 2017, 2017 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU).
[25] Xiaofei Wang,et al. A Comparative Study on Transformer vs RNN in Speech Applications , 2019, 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU).
[26] George Saon,et al. Speaker adaptation of neural network acoustic models using i-vectors , 2013, 2013 IEEE Workshop on Automatic Speech Recognition and Understanding.
[27] Yifan Gong,et al. Factorized adaptation for deep neural network , 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[28] Yifan Gong,et al. Self-Teaching Networks , 2019, INTERSPEECH.
[29] Fernando Pereira,et al. Weighted finite-state transducers in speech recognition , 2002, Comput. Speech Lang..