Investigating Adaptation and Transfer Learning for End-to-End Spoken Language Understanding from Speech
Yannick Estève | Natalia A. Tomashenko | Antoine Caubrière
[1] Guillaume Gravier,et al. The ESTER 2 evaluation campaign for the rich transcription of French radio broadcasts , 2009, INTERSPEECH.
[2] Olivier Pietquin,et al. Listen and Translate: A Proof of Concept for End-to-End Speech-to-Text Translation , 2016, NIPS 2016.
[3] Yannick Estève,et al. Simulating ASR errors for training SLU systems , 2018, LREC.
[4] Quoc V. Le,et al. Sequence to Sequence Learning with Neural Networks , 2014, NIPS.
[5] Srinivas Bangalore,et al. Spoken Language Understanding without Speech Recognition , 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[6] Yajie Miao,et al. EESEN: End-to-end speech recognition using deep RNN models and WFST-based decoding , 2015, 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU).
[7] Ying Zhang,et al. Batch normalized recurrent neural networks , 2015, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[8] Navdeep Jaitly,et al. Towards End-To-End Speech Recognition with Recurrent Neural Networks , 2014, ICML.
[9] Natalia A. Tomashenko,et al. Speaker adaptation of context dependent deep neural networks based on MAP-adaptation and GMM-derived feature processing , 2014, INTERSPEECH.
[10] Olivier Galibert,et al. The ETAPE corpus for the evaluation of speech-based TV content processing in the French language , 2012, LREC.
[11] Olivier Galibert,et al. Proposal for an Extension of Traditional Named Entities: From Guidelines to Evaluation, an Overview , 2011, Linguistic Annotation Workshop.
[12] Erich Elsen,et al. Deep Speech: Scaling up end-to-end speech recognition , 2014, ArXiv.
[13] Chong Wang,et al. Deep Speech 2: End-to-End Speech Recognition in English and Mandarin , 2015, ICML.
[14] Guillaume Gravier,et al. Is it time to switch to word embedding and recurrent neural networks for spoken language understanding? , 2015, INTERSPEECH.
[15] Mark J. F. Gales,et al. Maximum likelihood linear transformations for HMM-based speech recognition , 1998, Comput. Speech Lang.
[16] Yongqiang Wang,et al. Towards End-to-end Spoken Language Understanding , 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[17] Paul Deléglise,et al. Enhancing the TED-LIUM Corpus with Selected Data for Language Modeling and More TED Talks , 2014, LREC.
[18] Benoît Favre,et al. Robustesse et portabilités multilingue et multi-domaines des systèmes de compréhension de la parole : les corpus du projet PortMedia (Robustness and portability of spoken language understanding systems among languages and domains : the PORTMEDIA project) [in French] , 2012, JEP/TALN/RECITAL.
[19] Frédéric Béchet,et al. The EPAC Corpus: Manual and Automatic Annotations of Conversational Speech in French Broadcast News , 2010, LREC.
[20] Samy Bengio,et al. Tacotron: Towards End-to-End Speech Synthesis , 2017, INTERSPEECH.
[21] Daniel Povey,et al. The Kaldi Speech Recognition Toolkit , 2011.
[22] Yannick Estève,et al. Curriculum-based transfer learning for an effective end-to-end spoken language understanding and domain portability , 2019, INTERSPEECH.
[23] Frédéric Béchet,et al. Results of the French Evalda-Media evaluation campaign for literal understanding , 2006, LREC.
[24] Chin-Hui Lee,et al. Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains , 1994, IEEE Trans. Speech Audio Process.
[25] Shigeru Katagiri,et al. Speaker Adaptation for Multichannel End-to-End Speech Recognition , 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[26] Yannick Estève,et al. Evaluation of Feature-Space Speaker Adaptation for End-to-End Acoustic Models , 2018, LREC.
[27] Lucia Specia,et al. Semi-Supervised Adaptation of RNNLMs by Fine-Tuning with Domain-Specific Auxiliary Features , 2017, INTERSPEECH.
[28] Patrick Kenny,et al. Front-End Factor Analysis for Speaker Verification , 2011, IEEE Transactions on Audio, Speech, and Language Processing.
[29] Frédéric Béchet,et al. The French MEDIA/EVALDA Project: the Evaluation of the Understanding Capability of Spoken Language Dialogue Systems , 2004, LREC.
[30] Georg Heigold,et al. End-to-end text-dependent speaker verification , 2015, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[31] Yoshua Bengio,et al. End-to-end attention-based large vocabulary speech recognition , 2015, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[32] Qiang Yang,et al. A Survey on Transfer Learning , 2010, IEEE Transactions on Knowledge and Data Engineering.
[33] George Saon,et al. Speaker adaptation of neural network acoustic models using i-vectors , 2013, 2013 IEEE Workshop on Automatic Speech Recognition and Understanding.
[34] Tomohiro Nakatani,et al. Context Adaptive Neural Network for Rapid Adaptation of Deep CNN Based Acoustic Models , 2016, INTERSPEECH.
[35] Jürgen Schmidhuber,et al. Connectionist temporal classification: labelling unsegmented sequence data with recurrent neural networks , 2006, ICML.
[36] Arun Narayanan,et al. From Audio to Semantics: Approaches to End-to-End Spoken Language Understanding , 2018, 2018 IEEE Spoken Language Technology Workshop (SLT).
[37] David Suendermann-Oeft,et al. Exploring ASR-free end-to-end modeling to improve spoken language understanding in a cloud-based dialog system , 2017, 2017 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU).
[38] Shinji Watanabe,et al. Auxiliary Feature Based Adaptation of End-to-end ASR Systems , 2018, INTERSPEECH.
[39] Yifan Gong,et al. Speaker Adaptation for End-to-End CTC Models , 2018, 2018 IEEE Spoken Language Technology Workshop (SLT).
[40] Frédéric Béchet,et al. DECODA: a call-centre human-human spoken conversation corpus , 2012, LREC.
[41] Yannick Estève,et al. End-To-End Named Entity And Semantic Concept Extraction From Speech , 2018, 2018 IEEE Spoken Language Technology Workshop (SLT).
[42] Olivier Galibert,et al. The REPERE Corpus: a multimodal corpus for person recognition , 2012, LREC.