Multi-Task CTC Training with Auxiliary Feature Reconstruction for End-to-End Speech Recognition
暂无分享,去创建一个
[1] Yajie Miao,et al. EESEN: End-to-end speech recognition using deep RNN models and WFST-based decoding , 2015, 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU).
[2] Rico Sennrich,et al. Improving Neural Machine Translation Models with Monolingual Data , 2015, ACL.
[3] Navdeep Jaitly,et al. Towards End-To-End Speech Recognition with Recurrent Neural Networks , 2014, ICML.
[4] Jürgen Schmidhuber,et al. Connectionist temporal classification: labelling unsegmented sequence data with recurrent neural networks , 2006, ICML.
[5] Andrew Zisserman,et al. Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.
[6] Xiangang Li,et al. Gram-CTC: Automatic Unit Selection and Target Decomposition for Sequence Labelling , 2017, ICML.
[7] Geoffrey E. Hinton,et al. Speech recognition with deep recurrent neural networks , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[8] Xiaodong Cui,et al. English Conversational Telephone Speech Recognition by Humans and Machines , 2017, INTERSPEECH.
[9] Liang Lu,et al. Multitask Learning with Low-Level Auxiliary Tasks for Encoder-Decoder Based Speech Recognition , 2017, INTERSPEECH.
[10] Yoshua Bengio,et al. Attention-Based Models for Speech Recognition , 2015, NIPS.
[11] John R. Hershey,et al. Joint CTC/attention decoding for end-to-end speech recognition , 2017, ACL.
[12] Brian Kingsbury,et al. Building Competitive Direct Acoustics-to-Word Models for English Conversational Speech Recognition , 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[13] Shinji Watanabe,et al. Joint CTC-attention based end-to-end speech recognition using multi-task learning , 2016, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[14] Hagen Soltau,et al. Neural Speech Recognizer: Acoustic-to-Word LSTM Model for Large Vocabulary Speech Recognition , 2016, INTERSPEECH.
[15] Ebru Arisoy,et al. Low-rank matrix factorization for Deep Neural Network training with high-dimensional output targets , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[16] Awni Hannun,et al. Sequence Modeling with CTC , 2017 .
[17] Bhuvana Ramabhadran,et al. Direct Acoustics-to-Word Models for English Conversational Speech Recognition , 2017, INTERSPEECH.
[18] Jonathan G. Fiscus,et al. Darpa Timit Acoustic-Phonetic Continuous Speech Corpus CD-ROM {TIMIT} | NIST , 1993 .
[19] Andrew W. Senior,et al. Fast and accurate recurrent neural network acoustic models for speech recognition , 2015, INTERSPEECH.
[20] Tie-Yan Liu,et al. Dual Learning for Machine Translation , 2016, NIPS.
[21] Quoc V. Le,et al. Listen, attend and spell: A neural network for large vocabulary conversational speech recognition , 2015, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[22] Eneko Agirre,et al. Unsupervised Neural Machine Translation , 2017, ICLR.
[23] Yu Zhang,et al. Advances in Joint CTC-Attention Based End-to-End Speech Recognition with a Deep CNN Encoder and RNN-LM , 2017, INTERSPEECH.
[24] Alex Graves,et al. Generating Sequences With Recurrent Neural Networks , 2013, ArXiv.
[25] Lijun Wu,et al. Achieving Human Parity on Automatic Chinese to English News Translation , 2018, ArXiv.
[26] Karen Livescu,et al. Hierarchical Multitask Learning for CTC-based Speech Recognition , 2018, ArXiv.
[27] Satoshi Nakamura,et al. Listening while speaking: Speech chain by deep learning , 2017, 2017 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU).
[28] Yu Zhang,et al. A Survey on Multi-Task Learning , 2017, IEEE Transactions on Knowledge and Data Engineering.
[29] Tomoki Toda,et al. Back-Translation-Style Data Augmentation for end-to-end ASR , 2018, 2018 IEEE Spoken Language Technology Workshop (SLT).
[30] Jonathan Baxter,et al. A Bayesian/Information Theoretic Model of Learning to Learn via Multiple Task Sampling , 1997, Machine Learning.
[31] S. Furui,et al. Cepstral analysis technique for automatic speaker verification , 1981 .
[32] Sebastian Ruder,et al. An Overview of Multi-Task Learning in Deep Neural Networks , 2017, ArXiv.
[33] Yoshua Bengio,et al. End-to-end attention-based large vocabulary speech recognition , 2015, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[34] Tara N. Sainath,et al. Deep Neural Networks for Acoustic Modeling in Speech Recognition: The Shared Views of Four Research Groups , 2012, IEEE Signal Processing Magazine.
[35] Erich Elsen,et al. Deep Speech: Scaling up end-to-end speech recognition , 2014, ArXiv.
[36] Yu Tsao,et al. Discriminative Autoencoders for Acoustic Modeling , 2017, INTERSPEECH.
[37] Liang Lu,et al. Multitask Learning with CTC and Segmental CRF for Speech Recognition , 2017, INTERSPEECH.