Semi-Supervised Few-Shot Intent Classification and Slot Filling

Intent classification (IC) and slot filling (SF) are two fundamental tasks in modern Natural Language Understanding (NLU) systems. Collecting and annotating the large amounts of data needed to train deep learning models for such systems is not scalable. This problem can be addressed by learning from few examples using fast supervised meta-learning techniques such as prototypical networks. In this work, we systematically investigate how contrastive learning and unsupervised data augmentation methods can benefit existing supervised meta-learning pipelines for jointly modelled IC/SF tasks. Through extensive experiments across standard IC/SF benchmarks (SNIPS and ATIS), we show that our proposed semi-supervised approaches outperform standard supervised meta-learning methods: contrastive losses combined with prototypical networks consistently surpass the existing state of the art on both IC and SF tasks, while data augmentation strategies primarily improve few-shot IC by a significant margin.
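As a concrete illustration of the approach summarised above, the following is a minimal PyTorch-style sketch (not the authors' implementation) of a few-shot episode loss that combines a prototypical-network classification term with a supervised contrastive term. The encoder is abstracted away as pre-computed sentence embeddings; the embedding size, the temperature, and the weight `alpha` are illustrative assumptions, and the token-level slot-filling objective is omitted.

```python
import torch
import torch.nn.functional as F


def prototypical_loss(support_emb, support_labels, query_emb, query_labels):
    """Cross-entropy over negative squared distances to class prototypes."""
    classes = support_labels.unique()
    # Prototype = mean of the support embeddings belonging to each class.
    prototypes = torch.stack(
        [support_emb[support_labels == c].mean(dim=0) for c in classes])
    # Squared Euclidean distance from every query to every prototype.
    dists = torch.cdist(query_emb, prototypes) ** 2
    # Map each query label to the index of its prototype.
    targets = torch.stack([(classes == y).nonzero().squeeze() for y in query_labels])
    return F.cross_entropy(-dists, targets)


def supervised_contrastive_loss(emb, labels, temperature=0.1):
    """Pulls embeddings with the same label together, pushes others apart."""
    emb = F.normalize(emb, dim=-1)
    sim = emb @ emb.t() / temperature
    self_mask = torch.eye(len(emb), dtype=torch.bool)
    # Exclude self-similarity from the softmax; zero the diagonal afterwards
    # so the -inf entries cannot produce NaNs in the masked sum below.
    log_prob = (sim.masked_fill(self_mask, float("-inf"))
                .log_softmax(dim=-1)
                .masked_fill(self_mask, 0.0))
    pos_mask = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~self_mask
    # Mean log-probability over the positive pairs of each anchor.
    per_anchor = -(log_prob * pos_mask).sum(dim=-1) / pos_mask.sum(dim=-1).clamp(min=1)
    return per_anchor.mean()


if __name__ == "__main__":
    torch.manual_seed(0)
    # Toy 2-way, 5-shot episode; random vectors stand in for sentence
    # embeddings from an encoder such as BERT (illustrative assumption).
    support, query = torch.randn(10, 32), torch.randn(6, 32)
    support_y = torch.arange(2).repeat_interleave(5)
    query_y = torch.arange(2).repeat_interleave(3)

    proto = prototypical_loss(support, support_y, query, query_y)
    contrast = supervised_contrastive_loss(torch.cat([support, query]),
                                           torch.cat([support_y, query_y]))
    alpha = 0.5  # hypothetical weight on the contrastive term
    print(float(proto + alpha * contrast))
```

In a full pipeline the support and query embeddings would come from a shared sentence encoder trained episodically, and an analogous token-level objective would handle slot filling; this sketch only shows how an intent-level prototypical loss and a contrastive loss can be combined.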

[1] Quoc V. Le et al. QANet: Combining Local Convolution with Global Self-Attention for Reading Comprehension. ICLR, 2018.

[2] Ce Liu et al. Supervised Contrastive Learning. NeurIPS, 2020.

[3] Fei Chao et al. Task Augmentation by Rotating for Meta-Learning. ArXiv, 2020.

[4] Francesco Caltagirone et al. Snips Voice Platform: an embedded Spoken Language Understanding system for private-by-design voice interfaces. ArXiv, 2018.

[5] André F. T. Martins et al. Marian: Fast Neural Machine Translation in C++. ACL, 2018.

[6] Geoffrey E. Hinton et al. A Simple Framework for Contrastive Learning of Visual Representations. ICML, 2020.

[7] Richard S. Zemel et al. Prototypical Networks for Few-shot Learning. NIPS, 2017.

[8] Yi Zhang et al. Learning to Classify Intents and Slot Labels Given a Handful of Examples. NLP4CONVAI, 2020.

[9] Kenton Lee et al. Neural Data Augmentation via Example Extrapolation. ArXiv, 2021.

[10] Myle Ott et al. Understanding Back-Translation at Scale. EMNLP, 2018.

[11] Hugo Larochelle et al. Meta-Dataset: A Dataset of Datasets for Learning to Learn from Few Examples. ICLR, 2019.

[12] Micah Goldblum et al. Data Augmentation for Meta-Learning. ICML, 2020.

[13] Clement Chung et al. Encoding Syntactic Knowledge in Transformer Encoder for Intent Detection and Slot Filling. AAAI, 2020.

[14] Sergey Levine et al. Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks. ICML, 2017.

[15] Pushpak Bhattacharyya et al. A Deep Learning Based Multi-task Ensemble Model for Intent Detection and Slot Filling in Spoken Language Understanding. ICONIP, 2018.

[16] Kai Zou et al. EDA: Easy Data Augmentation Techniques for Boosting Performance on Text Classification Tasks. EMNLP, 2019.

[17] Tianyong Hao et al. A Feature-Enriched Method for User Intent Classification by Leveraging Semantic Tag Expansion. NLPCC, 2018.

[18] Guoyin Wang et al. Syntax-Infused Transformer and BERT models for Machine Translation and Natural Language Understanding. ArXiv, 2019.

[19] Quoc V. Le et al. Unsupervised Data Augmentation. ArXiv, 2019.

[20] Benoît Sagot et al. What Does BERT Learn about the Structure of Language? ACL, 2019.

[21] George R. Doddington et al. The ATIS Spoken Language Systems Pilot Corpus. HLT, 1990.

[22] Rico Sennrich et al. Improving Neural Machine Translation Models with Monolingual Data. ACL, 2015.

[23] Dilek Z. Hakkani-Tür et al. Robust Zero-Shot Cross-Domain Slot Filling with Example Values. ACL, 2019.

[24] Philip S. Yu et al. Mining User Intentions from Medical Queries: A Neural Network Based Heterogeneous Jointly Modeling Approach. WWW, 2016.

[25] Janarthanan Rajendran et al. Meta-Learning Requires Meta-Augmentation. NeurIPS, 2020.

[26] Fuji Ren et al. Intention Detection Based on Siamese Neural Network With Triplet Loss. IEEE Access, 2020.