A Sequential Matching Framework for Multi-Turn Response Selection in Retrieval-Based Chatbots

We study the problem of response selection for multi-turn conversation in retrieval-based chatbots. The task involves matching a response candidate with a conversation context, the challenges for which include how to recognize important parts of the context, and how to model the relationships among utterances in the context. Existing matching methods may lose important information in contexts as we can interpret them with a unified framework in which contexts are transformed to fixed-length vectors without any interaction with responses before matching. This motivates us to propose a new matching framework that can sufficiently carry important information in contexts to matching and model relationships among utterances at the same time. The new framework, which we call a sequential matching framework (SMF), lets each utterance in a context interact with a response candidate at the first step and transforms the pair to a matching vector. The matching vectors are then accumulated following the order of the utterances in the context with a recurrent neural network (RNN) that models relationships among utterances. Context-response matching is then calculated with the hidden states of the RNN. Under SMF, we propose a sequential convolutional network and sequential attention network and conduct experiments on two public data sets to test their performance. Experiment results show that both models can significantly outperform state-of-the-art matching methods. We also show that the models are interpretable with visualizations that provide us insights on how they capture and leverage important information in contexts for matching.

[1]  A. A. Mullin,et al.  Principles of neurodynamics , 1962 .

[2]  Frank Rosenblatt,et al.  PRINCIPLES OF NEURODYNAMICS. PERCEPTRONS AND THE THEORY OF BRAIN MECHANISMS , 1963 .

[3]  Joseph Weizenbaum,et al.  ELIZA—a computer program for the study of natural language communication between man and machine , 1966, CACM.

[4]  J. Fleiss Measuring nominal scale agreement among many raters. , 1971 .

[5]  Jong Kyoung Kim,et al.  Speech recognition , 1983, 1983 IEEE International Solid-State Circuits Conference. Digest of Technical Papers.

[6]  Jeffrey L. Elman,et al.  Finding Structure in Time , 1990, Cogn. Sci..

[7]  Jürgen Schmidhuber,et al.  Long Short-Term Memory , 1997, Neural Computation.

[8]  Wayne H. Ward,et al.  Speech recognition , 1997 .

[9]  Ellen M. Voorhees,et al.  The TREC-8 Question Answering Track Report , 1999, TREC.

[10]  Ellen M. Voorhees,et al.  The TREC-8 Question Answering Track , 2001, LREC.

[11]  Milica Gasic,et al.  The Hidden Information State model: A practical framework for POMDP-based spoken dialogue management , 2010, Comput. Speech Lang..

[12]  Alan Ritter,et al.  Data-Driven Response Generation in Social Media , 2011, EMNLP.

[13]  Geoffrey E. Hinton,et al.  Speech recognition with deep recurrent neural networks , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.

[14]  Danqi Chen,et al.  Reasoning With Neural Tensor Networks for Knowledge Base Completion , 2013, NIPS.

[15]  Jeffrey Dean,et al.  Distributed Representations of Words and Phrases and their Compositionality , 2013, NIPS.

[16]  Hao Wang,et al.  A Dataset for Research on Short-Text Conversations , 2013, EMNLP.

[17]  Larry P. Heck,et al.  Learning deep structured semantic models for web search using clickthrough data , 2013, CIKM.

[18]  Yoshua Bengio,et al.  Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation , 2014, EMNLP.

[19]  Yoon Kim,et al.  Convolutional Neural Networks for Sentence Classification , 2014, EMNLP.

[20]  Ryuichiro Higashinaka,et al.  Towards an open-domain conversational system fully based on natural language processing , 2014, COLING.

[21]  Yelong Shen,et al.  A Latent Semantic Model with Convolutional-Pooling Structure for Information Retrieval , 2014, CIKM.

[22]  Hang Li,et al.  Convolutional Neural Network Architectures for Matching Natural Language Sentences , 2014, NIPS.

[23]  Yoshua Bengio,et al.  Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling , 2014, ArXiv.

[24]  Hang Li,et al.  An Information Retrieval Approach to Short Text Conversation , 2014, ArXiv.

[25]  Jeffrey Pennington,et al.  GloVe: Global Vectors for Word Representation , 2014, EMNLP.

[26]  Qun Liu,et al.  Syntax-based Deep Matching of Short Texts , 2015, IJCAI.

[27]  Jianfeng Gao,et al.  A Neural Network Approach to Context-Sensitive Generation of Conversational Responses , 2015, NAACL.

[28]  Xuanjing Huang,et al.  Convolutional Neural Tensor Network Architecture for Community-Based Question Answering , 2015, IJCAI.

[29]  Alessandro Moschitti,et al.  Learning to Rank Short Text Pairs with Convolutional Deep Neural Networks , 2015, SIGIR.

[30]  Quoc V. Le,et al.  A Neural Conversational Model , 2015, ArXiv.

[31]  Joelle Pineau,et al.  The Ubuntu Dialogue Corpus: A Large Dataset for Research in Unstructured Multi-Turn Dialogue Systems , 2015, SIGDIAL Conference.

[32]  Jimmy Ba,et al.  Adam: A Method for Stochastic Optimization , 2014, ICLR.

[33]  Rudolf Kadlec,et al.  Improved Deep Learning Baselines for Ubuntu Corpus Dialogs , 2015, ArXiv.

[34]  Hang Li,et al.  Neural Responding Machine for Short-Text Conversation , 2015, ACL.

[35]  Bowen Zhou,et al.  LSTM-based Deep Learning Models for non-factoid answer selection , 2015, ArXiv.

[36]  Yoshua Bengio,et al.  Neural Machine Translation by Jointly Learning to Align and Translate , 2014, ICLR.

[37]  Wenpeng Yin,et al.  MultiGranCNN: An Architecture for General Matching of Text Chunks on Multiple Levels of Granularity , 2015, ACL.

[38]  Pengfei Liu,et al.  Modelling Interaction of Sentence Pair with Coupled-LSTMs , 2016, EMNLP.

[39]  Bowen Zhou,et al.  Improved Representation Learning for Question Answer Matching , 2016, ACL.

[40]  Jianfeng Gao,et al.  Deep Reinforcement Learning for Dialogue Generation , 2016, EMNLP.

[41]  Zhen-Hua Ling,et al.  Enhancing and Combining Sequential and Tree LSTM for Natural Language Inference , 2016, ArXiv.

[42]  Joelle Pineau,et al.  Building End-To-End Dialogue Systems Using Generative Hierarchical Neural Network Models , 2015, AAAI.

[43]  Jianfeng Gao,et al.  A Persona-Based Neural Conversation Model , 2016, ACL.

[44]  Phil Blunsom,et al.  Reasoning about Entailment with Neural Attention , 2015, ICLR.

[45]  Jimmy J. Lin,et al.  Pairwise Word Interaction Modeling with Deep Neural Networks for Semantic Similarity Measurement , 2016, NAACL.

[46]  Jakob Uszkoreit,et al.  A Decomposable Attention Model for Natural Language Inference , 2016, EMNLP.

[47]  Xuan Liu,et al.  Multi-view Response Selection for Human-Computer Conversation , 2016, EMNLP.

[48]  Jun Zhao,et al.  Inner Attention based Recurrent Neural Networks for Answer Selection , 2016, ACL.

[49]  Shuohang Wang,et al.  Learning Natural Language Inference with LSTM , 2015, NAACL.

[50]  Jianfeng Gao,et al.  A Diversity-Promoting Objective Function for Neural Conversation Models , 2015, NAACL.

[51]  John Salvatier,et al.  Theano: A Python framework for fast computation of mathematical expressions , 2016, ArXiv.

[52]  Rui Yan,et al.  Learning to Respond with Deep Neural Networks for Retrieval-Based Human-Computer Conversation System , 2016, SIGIR.

[53]  Bowen Zhou,et al.  ABCNN: Attention-Based Convolutional Neural Network for Modeling Sentence Pairs , 2015, TACL.

[54]  Xueqi Cheng,et al.  Match-SRNN: Modeling the Recursive Matching Structure with Spatial RNN , 2016, IJCAI.

[55]  Rui Yan,et al.  Sequence to Backward and Forward Sequences: A Content-Introducing Approach to Generative Short-Text Conversation , 2016, COLING.

[56]  Xuanjing Huang,et al.  Deep Fusion LSTMs for Text Semantic Matching , 2016, ACL.

[57]  Xueqi Cheng,et al.  A Deep Architecture for Semantic Matching with Multiple Positional Sentence Representations , 2015, AAAI.

[58]  Zhen Xu,et al.  Incorporating Loose-Structured Knowledge into LSTM with Recall Gate for Conversation Modeling , 2016, ArXiv.

[59]  Wei-Ying Ma,et al.  Topic Augmented Neural Response Generation with a Joint Attention Mechanism , 2016, ArXiv.

[60]  Alan Ritter,et al.  Adversarial Learning for Neural Dialogue Generation , 2017, EMNLP.

[61]  Shuohang Wang,et al.  A Compare-Aggregate Model for Matching Text Sequences , 2016, ICLR.

[62]  Joelle Pineau,et al.  A Hierarchical Latent Variable Encoder-Decoder Model for Generating Dialogues , 2016, AAAI.

[63]  Zhen Xu,et al.  Incorporating loose-structured knowledge into conversation modeling via recall-gate LSTM , 2016, 2017 International Joint Conference on Neural Networks (IJCNN).

[64]  Zhen-Hua Ling,et al.  Enhanced LSTM for Natural Language Inference , 2016, ACL.

[65]  Zhoujun Li,et al.  Sequential Match Network: A New Architecture for Multi-turn Response Selection in Retrieval-based Chatbots , 2016, ArXiv.

[66]  Bowen Zhou,et al.  Multiresolution Recurrent Neural Networks: An Application to Dialogue Response Generation , 2016, AAAI.

[67]  Wei-Ying Ma,et al.  Topic Aware Neural Response Generation , 2016, AAAI.

[68]  Xiaoyan Zhu,et al.  Emotional Chatting Machine: Emotional Conversation Generation with Internal and External Memory , 2017, AAAI.

[69]  Zhoujun Li,et al.  Learning Matching Models with Weak Supervision for Response Selection in Retrieval-based Chatbots , 2018, ACL.

[70]  Wei-Ying Ma,et al.  Hierarchical Recurrent Attention Network for Response Generation , 2017, AAAI.

[71]  Zhoujun Li,et al.  Neural Response Generation with Dynamic Vocabularies , 2017, AAAI.