Multilingual Sequence-to-Sequence Speech Recognition: Architecture, Transfer Learning, and Language Modeling
暂无分享,去创建一个
Shinji Watanabe | Sri Harish Reddy Mallidi | Takaaki Hori | Martin Karafiát | Murali Karthick Baskar | Ruizhi Li | Nelson Yalta | Jaejin Cho | Sri Harish Mallidi | Matthew Wiesner | M. Karafiát | Takaaki Hori | Shinji Watanabe | Ruizhi Li | Matthew Wiesner | Jaejin Cho | Nelson Yalta | M. Baskar
[1] Tomas Mikolov,et al. RNNLM - Recurrent Neural Network Language Modeling Toolkit , 2011 .
[2] Alex Graves,et al. Supervised Sequence Labelling with Recurrent Neural Networks , 2012, Studies in Computational Intelligence.
[3] Ying Zhang,et al. Towards End-to-End Speech Recognition with Deep Convolutional Neural Networks , 2016, INTERSPEECH.
[4] Navdeep Jaitly,et al. Towards End-To-End Speech Recognition with Recurrent Neural Networks , 2014, ICML.
[5] Tara N. Sainath,et al. Multilingual Speech Recognition with a Single End-to-End Model , 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[6] John R. Hershey,et al. Hybrid CTC/Attention Architecture for End-to-End Speech Recognition , 2017, IEEE Journal of Selected Topics in Signal Processing.
[7] John R. Hershey,et al. Language independent end-to-end architecture for joint language identification and speech recognition , 2017, 2017 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU).
[8] Shinji Watanabe,et al. ESPnet: End-to-End Speech Processing Toolkit , 2018, INTERSPEECH.
[9] Daniel Povey,et al. The Kaldi Speech Recognition Toolkit , 2011 .
[10] Yoshua Bengio,et al. Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation , 2014, EMNLP.
[11] Martin Karafiát,et al. Adaptation of multilingual stacked bottle-neck neural network structure for new language , 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[12] Quoc V. Le,et al. Listen, attend and spell: A neural network for large vocabulary conversational speech recognition , 2015, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[13] Bhuvana Ramabhadran,et al. End-to-end speech recognition and keyword search on low-resource languages , 2017, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[14] Steve Renals,et al. Learning hidden unit contributions for unsupervised speaker adaptation of neural network acoustic models , 2014, 2014 IEEE Spoken Language Technology Workshop (SLT).
[15] Alex Graves,et al. Supervised Sequence Labelling , 2012 .
[16] Matthias Sperber,et al. Self-Attentional Acoustic Models , 2018, INTERSPEECH.
[17] Hermann Ney,et al. Multilingual MRASTA features for low-resource keyword search and speech recognition systems , 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[18] Jan Cernocký,et al. Multilingual BLSTM and speaker-specific vector adaptation in 2016 but babel system , 2016, 2016 IEEE Spoken Language Technology Workshop (SLT).
[19] Tanja Schultz,et al. Automatic speech recognition for under-resourced languages: A survey , 2014, Speech Commun..
[20] Yoshua Bengio,et al. Attention-Based Models for Speech Recognition , 2015, NIPS.
[21] Yoshua Bengio,et al. Neural Machine Translation by Jointly Learning to Align and Translate , 2014, ICLR.
[22] Kuldip K. Paliwal,et al. Bidirectional recurrent neural networks , 1997, IEEE Trans. Signal Process..
[23] Sebastian Stüker,et al. Language Adaptive Multilingual CTC Speech Recognition , 2017, SPECOM.
[24] Hervé Bourlard,et al. Multilingual Training and Cross-lingual Adaptation on CTC-based Acoustic Model , 2017, ArXiv.
[25] Jürgen Schmidhuber,et al. Long Short-Term Memory , 1997, Neural Computation.
[26] Hervé Bourlard,et al. An Investigation of Deep Neural Networks for Multilingual Speech Recognition Training and Adaptation , 2017, INTERSPEECH.
[27] Quoc V. Le,et al. Sequence to Sequence Learning with Neural Networks , 2014, NIPS.
[28] Yu Zhang,et al. Advances in Joint CTC-Attention Based End-to-End Speech Recognition with a Deep CNN Encoder and RNN-LM , 2017, INTERSPEECH.
[29] Colin Raffel,et al. Monotonic Chunkwise Attention , 2017, ICLR.
[30] Florian Metze,et al. Sequence-Based Multi-Lingual Low Resource Speech Recognition , 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[31] Andrew Zisserman,et al. Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.