Joint Slot Filling and Intent Detection via Capsule Neural Networks

Recognizing which words in an utterance fill slots and detecting the utterance's intent are central problems in natural language understanding. Existing work either treats slot filling and intent detection separately in a pipeline, or adopts joint models that sequentially label slots while summarizing the utterance-level intent, without explicitly preserving the hierarchical relationship among words, slots, and intents. To exploit this semantic hierarchy for effective modeling, we propose a capsule-based neural network model that accomplishes slot filling and intent detection via a dynamic routing-by-agreement schema. A re-routing schema is further proposed to improve slot filling by leveraging the inferred intent representation. Experiments on two real-world datasets show the effectiveness of our model compared with alternative model architectures as well as existing natural language understanding services.
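
The dynamic routing-by-agreement schema mentioned in the abstract builds on the routing procedure of Sabour et al. (2017). The NumPy sketch below illustrates only that generic procedure, in which each lower-level capsule's "vote" is iteratively re-weighted by how well it agrees with the emerging higher-level capsule; the function names, tensor shapes, and three-iteration default are illustrative assumptions and do not reproduce this paper's specific capsule layers or its re-routing schema.

import numpy as np

def squash(s, axis=-1, eps=1e-8):
    # Squash non-linearity: short vectors shrink toward 0, long vectors toward unit length.
    sq_norm = np.sum(s ** 2, axis=axis, keepdims=True)
    return (sq_norm / (1.0 + sq_norm)) * s / np.sqrt(sq_norm + eps)

def routing_by_agreement(u_hat, num_iters=3):
    # Generic dynamic routing (Sabour et al., 2017); an illustrative sketch, not the paper's exact model.
    # u_hat: prediction ("vote") vectors of shape (num_in, num_out, dim_out).
    # Returns output capsule vectors of shape (num_out, dim_out).
    num_in, num_out, _ = u_hat.shape
    b = np.zeros((num_in, num_out))                            # routing logits
    for _ in range(num_iters):
        c = np.exp(b) / np.exp(b).sum(axis=1, keepdims=True)   # softmax over output capsules
        s = (c[:, :, None] * u_hat).sum(axis=0)                 # agreement-weighted sum of votes
        v = squash(s)                                           # output capsule vectors
        b = b + np.einsum('ijk,jk->ij', u_hat, v)               # update logits by vote-output agreement
    return v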
