Multi-turn Intent Determination for Goal-oriented Dialogue Systems

Intent determination is one of the main tasks in natural language understanding and dialogue systems, aiming to identify the intent behind the user's input. Incorporating contextual information into intent determination has shown much promise. Recently, memory networks have been used to encode context from the dialogue history at each turn. However, these methods lack a general mechanism for encoding intent patterns. In this paper, we investigate incorporating intent patterns extracted from regular expressions into the multi-turn intent determination task. We propose a novel neural network model with two memories: the first encodes context from the dialogue history at each turn, while the second contains information obtained from the regular expression patterns. We evaluate the model on the Frames and Key-Value Retrieval datasets; the experimental results demonstrate that encoding intent patterns with memory networks significantly improves multi-turn intent determination.
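To make the two-memory idea concrete, the following is a minimal sketch, not the paper's actual implementation: the current utterance embedding attends over a dialogue-history memory and, separately, over a memory of embedded regex intent patterns, and the two reads are combined before scoring intents. The shapes, the single-hop attention, and the simple additive combination are all assumptions made for illustration.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D score vector."""
    e = np.exp(x - x.max())
    return e / e.sum()

def memory_read(query, memory):
    """One attention hop: score each memory slot against the query
    and return the attention-weighted sum of slots.
    query: (dim,), memory: (slots, dim) -> (dim,)"""
    weights = softmax(memory @ query)
    return weights @ memory

def two_memory_intent_scores(utterance, history_mem, pattern_mem, W_out):
    """Hedged sketch of a two-memory intent classifier.
    history_mem holds embedded past turns; pattern_mem holds
    embeddings of regex-derived intent patterns (both hypothetical
    encodings). Combining by summation is an assumption.
    Returns a probability distribution over intents."""
    c_hist = memory_read(utterance, history_mem)   # context from dialogue history
    c_pat = memory_read(utterance, pattern_mem)    # evidence from intent patterns
    o = utterance + c_hist + c_pat                 # fused representation
    return softmax(W_out @ o)                      # (num_intents,)
```

In a trained model the memories would be produced by learned encoders (e.g. an RNN over past turns) and `W_out` learned end-to-end; here random matrices suffice to show the data flow.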
