Improving Spoken Language Understanding with Cross-Modal Contrastive Learning
Hao Li | P. Zhou | Jingjing Dong | Xiaorui Wang | Jiayi Fu
[1] Siegfried Kunzmann,et al. Tie Your Embeddings Down: Cross-Modal Latent Spaces for End-to-end Spoken Language Understanding , 2020, ICASSP.
[2] Maulik C. Madhavi,et al. Knowledge Distillation from BERT Transformer to Speech Transformer for Intent Classification , 2021, Interspeech.
[3] Florian Metze,et al. Rethinking End-to-End Evaluation of Decomposable Tasks: A Case Study on Spoken Language Understanding , 2021, Interspeech.
[4] Michael Zeng,et al. SPLAT: Speech-Language Joint Pre-Training for Spoken Language Understanding , 2021, NAACL.
[5] Wen Wang,et al. Pre-training for Spoken Language Understanding with Joint Textual and Phonetic Representation Learning , 2021, Interspeech.
[6] Michael Picheny,et al. Speak or Chat with Me: End-to-End Spoken Language Understanding System with Flexible Inputs , 2021, Interspeech.
[7] Ilya Sutskever,et al. Learning Transferable Visual Models From Natural Language Supervision , 2021, ICML.
[8] Maulik C. Madhavi,et al. Leveraging Acoustic and Linguistic Embeddings from Pretrained Speech and Language Models for Intent Classification , 2021, ICASSP.
[9] Seongjin Shin,et al. Two-Stage Textual Knowledge Distillation for End-to-End Spoken Language Understanding , 2020, ICASSP 2021.
[10] Gyuwan Kim,et al. ST-BERT: Cross-Modal Language Model Pre-Training for End-to-End Spoken Language Understanding , 2020, ICASSP 2021.
[11] Siegfried Kunzmann,et al. End-to-End Neural Transformer Based Spoken Language Understanding , 2020, Interspeech.
[12] Ngoc Thang Vu,et al. Pretrained Semantic Speech Embeddings for End-to-End Spoken Language Understanding via Cross-Modal Teacher-Student Learning , 2020, Interspeech.
[13] Joakim Lindblad,et al. CoMIR: Contrastive Multimodal Image Representation for Registration , 2020, NeurIPS.
[14] Nam Soo Kim,et al. Speech to Text Adaptation: Towards an Efficient Cross-Modal Distillation , 2020, Interspeech.
[15] Yu Zhang,et al. Conformer: Convolution-augmented Transformer for Speech Recognition , 2020, Interspeech.
[16] Srinivas Bangalore,et al. Improved End-To-End Spoken Utterance Classification with a Self-Attention Acoustic Classifier , 2020, ICASSP.
[17] Yoshua Bengio,et al. Speech Model Pre-training for End-to-End Spoken Language Understanding , 2019, Interspeech.
[18] James R. Glass,et al. Jointly Discovering Visual Objects and Spoken Words from Raw Sensory Input , 2018, International Journal of Computer Vision.
[19] Ming-Wei Chang,et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding , 2019, NAACL.
[20] Oriol Vinyals,et al. Representation Learning with Contrastive Predictive Coding , 2018, ArXiv.
[21] Francesco Caltagirone,et al. Snips Voice Platform: an embedded Spoken Language Understanding system for private-by-design voice interfaces , 2018, ArXiv.
[22] Yongqiang Wang,et al. Towards End-to-end Spoken Language Understanding , 2018, ICASSP.
[23] Dilek Z. Hakkani-Tür,et al. Deep Learning for Dialogue Systems , 2017, COLING.
[24] Alexandros Potamianos,et al. Speech understanding for spoken dialogue systems: From corpus harvesting to grammar rule induction , 2018, Comput. Speech Lang.
[25] Sanjeev Khudanpur,et al. Librispeech: An ASR corpus based on public domain audio books , 2015, ICASSP.
[26] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[27] Gabriel Skantze,et al. User Feedback in Human-Robot Dialogue: Task Progression and Uncertainty , 2014, HRI.
[28] Gokhan Tur,et al. Spoken Language Understanding: Systems for Extracting Semantic Information from Speech , 2011.
[29] David Suendermann,et al. SLU in Commercial and Research Spoken Dialogue Systems , 2011.
[30] Gary Geunbae Lee,et al. Recent Approaches to Dialog Management for Spoken Dialog Systems , 2010, J. Comput. Sci. Eng.
[31] Dong Yu,et al. An Integrative and Discriminative Technique for Spoken Utterance Classification , 2008, IEEE Transactions on Audio, Speech, and Language Processing.
[32] Cheng Wu,et al. Language model estimation for optimizing end-to-end performance of a natural language call routing system , 2005, ICASSP.
[33] Joseph Polifroni,et al. A form-based dialogue manager for spoken language applications , 1996, ICSLP.