GL-GIN: Fast and Accurate Non-Autoregressive Model for Joint Multiple Intent Detection and Slot Filling

Multi-intent SLU can handle multiple intents in an utterance and has attracted increasing attention. However, state-of-the-art joint models rely heavily on autoregressive approaches, which cause two issues: slow inference speed and information leakage. In this paper, we explore a non-autoregressive model for joint multiple intent detection and slot filling, achieving both faster inference and higher accuracy. Specifically, we propose a Global-Locally Graph Interaction Network (GL-GIN), in which a local slot-aware graph interaction layer models slot dependency to alleviate the uncoordinated slots problem, while a global intent-slot graph interaction layer models the interaction between multiple intents and all slots in the utterance. Experimental results on two public datasets show that our framework achieves state-of-the-art performance while being 11.5 times faster.
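The core operation shared by both interaction layers is graph attention: each node (an intent label or a slot token) aggregates information from its neighbours with learned attention weights. The sketch below is a rough, hypothetical illustration of one such update, not the authors' implementation; all names, dimensions, and the toy adjacency matrix are assumptions for demonstration only.

```python
import numpy as np

def graph_attention_layer(H, A, W, a, leaky_slope=0.2):
    """One single-head graph-attention update over an intent-slot graph.

    H: (n, d_in) node features (intent nodes and slot nodes together).
    A: (n, n) adjacency mask; A[i, j] > 0 means node i attends to node j.
    W: (d_in, d_out) shared linear projection.
    a: (2 * d_out,) attention vector for pairwise scores.
    All shapes/names here are hypothetical, not from the paper.
    """
    Z = H @ W                                # project node features
    n = Z.shape[0]
    # pairwise attention logits e_ij = LeakyReLU(a^T [z_i ; z_j])
    e = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            s = a @ np.concatenate([Z[i], Z[j]])
            e[i, j] = s if s > 0 else leaky_slope * s
    # mask non-edges, then softmax over each node's neighbourhood
    e = np.where(A > 0, e, -1e9)
    alpha = np.exp(e - e.max(axis=1, keepdims=True))
    alpha = alpha / alpha.sum(axis=1, keepdims=True)
    return np.tanh(alpha @ Z)                # aggregate and activate

# toy setup: 2 intent nodes + 3 slot nodes, densely connected (hypothetical)
rng = np.random.default_rng(0)
n, d_in, d_out = 5, 8, 8
H = rng.normal(size=(n, d_in))
A = np.ones((n, n))
W = rng.normal(size=(d_in, d_out))
a = rng.normal(size=(2 * d_out,))
H_new = graph_attention_layer(H, A, W, a)
print(H_new.shape)
```

In the global layer the adjacency would connect every slot node to every predicted intent node, so intent information guides all slot decisions in parallel rather than token by token, which is what makes the model non-autoregressive.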