Graph LSTM with Context-Gated Mechanism for Spoken Language Understanding

Much research in recent years has focused on spoken language understanding (SLU), which usually involves two tasks: intent detection and slot filling. Since Yao et al. (2013), almost all SLU systems have been RNN-based, and such models have been shown to suffer from various limitations due to their sequential nature. In this paper, we propose to tackle these tasks with a Graph LSTM, which first converts the text into a graph and then uses a message-passing mechanism to learn node representations. Not only does the Graph LSTM address the limitations of sequential models, it also helps exploit the semantic correlation between slots and intents. We further propose a context-gated mechanism to make better use of context information for slot filling. Our extensive evaluation shows that the proposed model outperforms the state of the art by a large margin.
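
To make the mechanism concrete, below is a minimal sketch of the two components in PyTorch. It is not the authors' implementation: the graph construction (window neighbours with a row-normalised adjacency), all module and variable names (GraphLSTMLayer, ContextGate, GraphSLU), and the mean-pooled intent head are illustrative assumptions. The sketch only shows the general shape of LSTM-style message passing over a token graph and a context gate that fuses each word state with its aggregated neighbourhood before slot tagging.

    # Hypothetical sketch, not the paper's released code.
    import torch
    import torch.nn as nn

    def window_adjacency(n, window=1):
        # Each token is linked to itself and to neighbours within the window.
        a = torch.zeros(n, n)
        for i in range(n):
            lo, hi = max(0, i - window), min(n, i + window + 1)
            a[i, lo:hi] = 1.0
        return a / a.sum(dim=-1, keepdim=True)  # row-normalise for averaging

    class GraphLSTMLayer(nn.Module):
        """One round of message passing with LSTM-style gating per node."""
        def __init__(self, dim):
            super().__init__()
            # Gates read [own state; aggregated neighbour states].
            self.gates = nn.Linear(2 * dim, 4 * dim)

        def forward(self, h, c, adj):
            # h, c: (batch, n, dim); adj: (batch, n, n), row-normalised.
            m = torch.bmm(adj, h)  # aggregate messages from neighbours
            i, f, o, g = self.gates(torch.cat([h, m], dim=-1)).chunk(4, dim=-1)
            c = torch.sigmoid(f) * c + torch.sigmoid(i) * torch.tanh(g)
            h = torch.sigmoid(o) * torch.tanh(c)
            return h, c

    class ContextGate(nn.Module):
        """Per-dimension gate blending a word state with its context summary."""
        def __init__(self, dim):
            super().__init__()
            self.gate = nn.Linear(2 * dim, dim)

        def forward(self, h_word, h_ctx):
            z = torch.sigmoid(self.gate(torch.cat([h_word, h_ctx], dim=-1)))
            return z * h_word + (1.0 - z) * h_ctx

    class GraphSLU(nn.Module):
        """Embeddings -> T message-passing rounds -> gated slot tagging
        plus a mean-pooled utterance-level intent classifier."""
        def __init__(self, vocab_size, dim, n_slots, n_intents, steps=3):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, dim)
            self.layer = GraphLSTMLayer(dim)
            self.ctx_gate = ContextGate(dim)
            self.slot_out = nn.Linear(dim, n_slots)
            self.intent_out = nn.Linear(dim, n_intents)
            self.steps = steps

        def forward(self, tokens, adj):
            h = self.embed(tokens)
            c = torch.zeros_like(h)
            for _ in range(self.steps):  # iterative message passing
                h, c = self.layer(h, c, adj)
            h_ctx = torch.bmm(adj, h)    # neighbourhood summary per token
            slot_logits = self.slot_out(self.ctx_gate(h, h_ctx))
            intent_logits = self.intent_out(h.mean(dim=1))
            return slot_logits, intent_logits

    if __name__ == "__main__":
        tokens = torch.randint(0, 100, (2, 7))  # batch of 2 utterances, 7 tokens
        adj = window_adjacency(7).unsqueeze(0).expand(2, -1, -1)
        model = GraphSLU(vocab_size=100, dim=32, n_slots=10, n_intents=5)
        slots, intent = model(tokens, adj)
        print(slots.shape, intent.shape)  # (2, 7, 10) and (2, 5)

Because every node exchanges messages with its neighbours in parallel at each step, the number of rounds, not the sentence length, bounds the dependency distance, which is one way graph LSTMs avoid the sequential bottleneck the abstract refers to.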

[1] Ruhi Sarikaya, et al. Deep belief network based semantic taggers for spoken language understanding, 2013, INTERSPEECH.

[2] Nitish Srivastava, et al. Improving neural networks by preventing co-adaptation of feature detectors, 2012, ArXiv.

[3] Gökhan Tür, et al. What is left to be understood in ATIS?, 2010, 2010 IEEE Spoken Language Technology Workshop (SLT).

[4] Chih-Li Huo, et al. Slot-Gated Modeling for Joint Slot Filling and Intent Prediction, 2018, NAACL.

[5] George R. Doddington, et al. The ATIS Spoken Language Systems Pilot Corpus, 1990, HLT.

[6] Ruhi Sarikaya, et al. Convolutional neural network based triangular CRF for joint intent detection and slot filling, 2013, 2013 IEEE Workshop on Automatic Speech Recognition and Understanding.

[7] Nanyun Peng, et al. Cross-Sentence N-ary Relation Extraction with Graph LSTMs, 2017, TACL.

[8] Houfeng Wang, et al. A Joint Model of Intent Determination and Slot Filling for Spoken Language Understanding, 2016, IJCAI.

[9] Liang Li, et al. A Self-Attentive Model with Gate Mechanism for Spoken Language Understanding, 2018, EMNLP.

[10] Gökhan Tür, et al. Use of kernel deep convex networks and end-to-end learning for spoken language understanding, 2012, 2012 IEEE Spoken Language Technology Workshop (SLT).

[11] Meina Song, et al. A Novel Bi-directional Interrelated Model for Joint Intent Detection and Slot Filling, 2019, ACL.

[12] Sergey Ioffe, et al. Rethinking the Inception Architecture for Computer Vision, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[13] Yue Zhang, et al. Sentence-State LSTM for Text Representation, 2018, ACL.

[14] Gökhan Tür, et al. Sentence simplification for spoken language understanding, 2011, 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[15] Bing Liu, et al. Attention-Based Recurrent Neural Network Models for Joint Intent Detection and Slot Filling, 2016, INTERSPEECH.

[16] Geoffrey Zweig, et al. Using Recurrent Neural Networks for Slot Filling in Spoken Language Understanding, 2015, IEEE/ACM Transactions on Audio, Speech, and Language Processing.

[17] Geoffrey Zweig, et al. Spoken language understanding using long short-term memory neural networks, 2014, 2014 IEEE Spoken Language Technology Workshop (SLT).

[18] Gökhan Tür, et al. Optimizing SVMs for complex call classification, 2003, 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '03).

[19] Andrew McCallum, et al. Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data, 2001, ICML.

[20] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.

[21] Mari Ostendorf, et al. Conversation Modeling on Reddit Using a Graph-Structured LSTM, 2017, TACL.

[22] Sungjin Lee, et al. OneNet: Joint domain, intent, slot prediction for spoken language understanding, 2017, 2017 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU).

[23] Geoffrey Zweig, et al. Recurrent neural networks for language understanding, 2013, INTERSPEECH.

[24] Francesco Caltagirone, et al. Snips Voice Platform: an embedded Spoken Language Understanding system for private-by-design voice interfaces, 2018, ArXiv.

[25] Lukasz Kaiser, et al. Attention is All you Need, 2017, NIPS.

[26] Hongxia Jin, et al. A Bi-Model Based RNN Semantic Frame Parsing Model for Intent Detection and Slot Filling, 2018, NAACL.

[27] Shuicheng Yan, et al. Semantic Object Parsing with Graph LSTM, 2016, ECCV.

[28] Gökhan Tür, et al. Multi-Domain Joint Semantic Frame Parsing Using Bi-Directional RNN-LSTM, 2016, INTERSPEECH.

[29] Luke S. Zettlemoyer, et al. Deep Contextualized Word Representations, 2018, NAACL.

[30] Xu Sun, et al. Global Encoding for Abstractive Summarization, 2018, ACL.

[31] Yue Zhang, et al. N-ary Relation Extraction using Graph-State LSTM, 2018, EMNLP.