Chat More If You Like: Dynamic Cue Words Planning to Flow Longer Conversations

Building an open-domain multi-turn conversation system is one of the most interesting and challenging tasks in Artificial Intelligence. Many research efforts have been dedicated to building such dialogue systems, yet few shed light on modeling the conversation flow of an ongoing dialogue. Moreover, people commonly talk about closely related aspects during a conversation, and the topics stay coherent while drifting naturally, which underscores the need for dialogue flow modeling. To this end, we present RLCw, a multi-turn cue-word-driven conversation system trained with reinforcement learning, which selects an adaptive cue word with the greatest expected future credit and thereby improves the quality of generated responses. We introduce a new reward that measures the quality of cue words in terms of effectiveness and relevance. To further optimize the model for long-term conversations, we adopt a reinforcement learning approach. Experiments on a real-life dataset demonstrate that our model consistently outperforms a set of competitive baselines in terms of simulated turns, diversity, and human evaluation.
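To make the abstract's idea of reward-driven cue-word selection concrete, here is a minimal sketch (not the authors' implementation) of a REINFORCE-style cue-word policy. The cue-word vocabulary, the bag-of-words context encoding, and the `effectiveness`/`relevance` terms inside `reward` are all hypothetical stand-ins for the signals described in the paper; only the policy-gradient update itself is standard.

```python
# Minimal sketch of cue-word selection with a REINFORCE policy gradient.
# All names (CUE_WORDS, reward terms, feature size) are illustrative assumptions,
# not the paper's actual components.
import numpy as np

rng = np.random.default_rng(0)

CUE_WORDS = ["travel", "food", "music", "movies", "sports"]   # hypothetical cue vocabulary
DIM = 16                                                      # assumed context feature size

# Linear softmax policy: scores = context @ W, one column per candidate cue word.
W = 0.01 * rng.standard_normal((DIM, len(CUE_WORDS)))

def softmax(x):
    z = x - x.max()
    e = np.exp(z)
    return e / e.sum()

def select_cue(context):
    """Sample a cue-word index and return it with the full probability vector."""
    probs = softmax(context @ W)
    idx = rng.choice(len(CUE_WORDS), p=probs)
    return idx, probs

def reward(context, cue_idx):
    """Placeholder reward mixing 'effectiveness' and 'relevance'.
    In the paper these are computed from the generated dialogue; here they are
    random stand-ins just to make the update loop runnable."""
    effectiveness = rng.random()   # e.g., how much the cue word prolongs the chat
    relevance = rng.random()       # e.g., similarity between cue word and context
    return 0.5 * effectiveness + 0.5 * relevance

def reinforce_step(context, lr=0.1, baseline=0.5):
    """One REINFORCE update: grad log pi(a|s) * (r - baseline)."""
    global W
    idx, probs = select_cue(context)
    r = reward(context, idx)
    grad_logp = -np.outer(context, probs)   # d log softmax / dW over all actions
    grad_logp[:, idx] += context            # extra term for the sampled action
    W += lr * (r - baseline) * grad_logp
    return CUE_WORDS[idx], r

if __name__ == "__main__":
    for turn in range(5):
        ctx = rng.standard_normal(DIM)       # stand-in for an encoded dialogue context
        cue, r = reinforce_step(ctx)
        print(f"turn {turn}: cue={cue!r}, reward={r:.2f}")
```

In the actual system the reward would be computed over the ensuing turns of the conversation (future credit) rather than per step, but the sketch shows where such a long-term signal would enter the policy update.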
