Meta Learning for Few-Shot Joint Intent Detection and Slot-Filling

Intent detection and slot filling are the two main tasks of the natural language understanding module in goal-oriented conversational agents. Models that optimize these two objectives simultaneously within a single network (joint models) have proven superior to single-objective networks. However, these data-intensive deep learning approaches have not been successful in meeting the industry's demand for adaptable, multilingual dialogue systems. To this end, we cast joint intent detection and slot filling as an n-way k-shot classification problem and place it within a meta-learning setup. Our approach is motivated by the success of meta-learning on few-shot image classification tasks. We empirically demonstrate that our approach can meta-learn a prior from similar tasks under highly resource-constrained settings, enabling rapid adaptation to target tasks. First, we show the adaptability of the proposed approach by meta-learning n-way k-shot joint intent detection on one set of intents and evaluating on a completely new set of intents. Second, we demonstrate cross-lingual adaptability by learning a prior from English utterances and evaluating on Spanish and Thai utterances. Compared to random initialization, our method significantly improves accuracy in both intent detection and slot filling.
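The abstract describes the method only at a high level. For concreteness, below is a minimal first-order MAML-style sketch (in the spirit of Model-Agnostic Meta-Learning, one common way to "meta-learn a prior") of episodic training for joint intent detection and slot filling. The JointNLU architecture, the fomaml_step helper, and all dimensions and hyperparameters are illustrative assumptions, not the paper's actual implementation.

```python
# A minimal first-order MAML (FOMAML) sketch for few-shot joint intent
# detection and slot filling. All names and hyperparameters are
# illustrative assumptions, not the paper's reported configuration.
import copy
import torch
import torch.nn as nn

class JointNLU(nn.Module):
    """Toy joint model: a BiLSTM encoder with an utterance-level intent
    head and a token-level slot head."""
    def __init__(self, vocab_size=1000, emb_dim=64, hid=64,
                 n_intents=5, n_slots=10):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.enc = nn.LSTM(emb_dim, hid, bidirectional=True, batch_first=True)
        self.intent_head = nn.Linear(2 * hid, n_intents)
        self.slot_head = nn.Linear(2 * hid, n_slots)

    def forward(self, tokens):
        h, _ = self.enc(self.emb(tokens))                # (B, T, 2*hid)
        intent_logits = self.intent_head(h.mean(dim=1))  # (B, n_intents)
        slot_logits = self.slot_head(h)                  # (B, T, n_slots)
        return intent_logits, slot_logits

def joint_loss(model, tokens, intents, slots):
    """Joint objective: sum of intent and slot cross-entropies."""
    intent_logits, slot_logits = model(tokens)
    ce = nn.functional.cross_entropy
    return (ce(intent_logits, intents) +
            ce(slot_logits.flatten(0, 1), slots.flatten()))

def fomaml_step(model, meta_opt, tasks, inner_lr=0.01, inner_steps=3):
    """One meta-update. Each task is an n-way k-shot episode given as
    (support, query) batches of (tokens, intents, slots)."""
    meta_opt.zero_grad()
    for support, query in tasks:
        learner = copy.deepcopy(model)                # task-specific copy
        inner_opt = torch.optim.SGD(learner.parameters(), lr=inner_lr)
        for _ in range(inner_steps):                  # adapt on support set
            inner_opt.zero_grad()
            joint_loss(learner, *support).backward()
            inner_opt.step()
        learner.zero_grad()
        joint_loss(learner, *query).backward()        # evaluate adaptation
        # First-order approximation: accumulate the adapted model's query
        # gradients directly onto the meta-parameters.
        for p, lp in zip(model.parameters(), learner.parameters()):
            g = lp.grad.detach() / len(tasks)
            p.grad = g.clone() if p.grad is None else p.grad + g
    meta_opt.step()

# Illustrative usage with random data (5-way episodes, sequences of length 8):
if __name__ == "__main__":
    torch.manual_seed(0)
    model = JointNLU()
    meta_opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    def fake_batch(n=5):
        return (torch.randint(0, 1000, (n, 8)),   # token ids
                torch.randint(0, 5, (n,)),        # intent labels
                torch.randint(0, 10, (n, 8)))     # slot labels (IOB ids)
    tasks = [(fake_batch(), fake_batch()) for _ in range(4)]
    fomaml_step(model, meta_opt, tasks)
```

The first-order approximation is used here only to keep the sketch short, since it avoids second-order gradients through the inner loop; the paper may use the full second-order update or a different meta-learner entirely.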
