Contrastive Zero-Shot Learning for Cross-Domain Slot Filling with Adversarial Attack

Zero-shot slot filling has emerged as a common way to cope with data scarcity in target domains. However, previous approaches often ignore the constraint between slot value representations and the corresponding slot description representations in the latent space, and they lack sufficient model robustness. In this paper, we propose a Contrastive Zero-Shot Learning with Adversarial Attack (CZSL-Adv) method for cross-domain slot filling. The contrastive loss pulls slot value contextual representations toward their corresponding slot description representations, and an adversarial attack training strategy is introduced to improve model robustness. Experimental results show that our model significantly outperforms state-of-the-art baselines under both zero-shot and few-shot settings.
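The following is a minimal sketch of the two training signals the abstract describes, assuming an InfoNCE-style contrastive objective and an FGM-style perturbation on the embeddings (in the spirit of adversarial training for text classification). The function names, temperature, and epsilon below are illustrative assumptions, not the paper's actual implementation.

# Minimal PyTorch sketch (assumptions noted above; not the paper's exact method).
import torch
import torch.nn.functional as F

def contrastive_loss(value_reprs, desc_reprs, temperature=0.1):
    """Pull each slot value contextual representation toward its own
    slot description representation and away from the other descriptions
    in the batch (InfoNCE-style; assumed formulation).

    value_reprs: (batch, dim) contextual representations of slot values
    desc_reprs:  (batch, dim) representations of the matching slot descriptions
    """
    value_reprs = F.normalize(value_reprs, dim=-1)
    desc_reprs = F.normalize(desc_reprs, dim=-1)
    logits = value_reprs @ desc_reprs.t() / temperature  # (batch, batch) similarities
    targets = torch.arange(logits.size(0), device=logits.device)  # diagonal = positives
    return F.cross_entropy(logits, targets)

def fgm_perturb(embeddings, loss, epsilon=1.0):
    """FGM-style adversarial perturbation (assumed attack): step the input
    embeddings in the gradient direction that increases the loss. The model
    is then also trained on the perturbed embeddings to improve robustness.
    `embeddings` must be part of the computation graph of `loss`.
    """
    grad, = torch.autograd.grad(loss, embeddings, retain_graph=True)
    return embeddings + epsilon * grad / (grad.norm() + 1e-12)

A typical training step under these assumptions would compute the loss on the clean input, derive perturbed embeddings with fgm_perturb, recompute the loss on them, and optimize the sum of the two losses.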
