Exploring Implicit Feedback for Open Domain Conversation Generation

User feedback can be an effective indicator of the success of a human-machine conversation. However, to avoid interrupting the real-time conversation, explicit feedback is usually collected only at the end of a session. Alternatively, users' responses often carry implicit feedback, such as stance, sentiment, and emotion, towards the conversation content or the interlocutor. Exploiting this implicit feedback is therefore a natural way to optimize the conversation generation process. In this paper, we propose a novel reward function that uses implicit feedback to optimize the future reward of a reinforcement-learning-based neural conversation model. A simulation strategy is applied to explore the state-action space during training and testing. Experimental results show that the proposed approach outperforms the Seq2Seq model and a state-of-the-art reinforcement learning model for conversation generation in both automatic and human evaluations on the OpenSubtitles and Twitter datasets.
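To make the idea concrete, the sketch below shows one plausible way an implicit-feedback signal could drive a policy-gradient (REINFORCE-style) update: the agent picks a response, a user simulator produces a reply, a sentiment/stance scorer turns that reply into a scalar reward, and the policy is nudged toward responses that earn positive feedback. This is a minimal illustrative sketch only; the toy lexicon, the `implicit_feedback_reward` and `simulate_user_reply` functions, the candidate-list policy, and all hyperparameters are assumptions for illustration, not the paper's actual models or reward function.

```python
# Minimal sketch: reward shaping with implicit feedback in a policy-gradient loop.
# All names, lexicons, and hyperparameters here are illustrative assumptions,
# not the paper's implementation.
import numpy as np

# Toy sentiment/stance lexicon standing in for the implicit-feedback classifiers.
POSITIVE = {"great", "thanks", "interesting", "sure"}
NEGATIVE = {"boring", "whatever", "stop", "no"}

def implicit_feedback_reward(user_reply: str) -> float:
    """Score the simulated user's reply: positive words raise the reward,
    negative words lower it, neutral replies score zero."""
    tokens = user_reply.lower().split()
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    return float(pos - neg) / max(len(tokens), 1)

def simulate_user_reply(bot_utterance: str) -> str:
    """Placeholder user simulator; in the paper's setting this would be a
    learned model that continues the dialogue."""
    return "interesting , tell me more" if "?" in bot_utterance else "boring"

# Candidate responses the toy policy chooses among; a real system would
# sample full utterances from a Seq2Seq decoder instead.
CANDIDATES = ["what do you think about it ?", "i do not know .", "ok ."]
theta = np.zeros(len(CANDIDATES))  # softmax policy parameters (logits)

def sample_response(rng):
    probs = np.exp(theta - theta.max())
    probs /= probs.sum()
    idx = rng.choice(len(CANDIDATES), p=probs)
    return idx, probs

rng = np.random.default_rng(0)
lr, baseline = 0.5, 0.0
for step in range(200):
    idx, probs = sample_response(rng)
    reply = simulate_user_reply(CANDIDATES[idx])
    reward = implicit_feedback_reward(reply)
    baseline = 0.9 * baseline + 0.1 * reward  # running baseline reduces variance
    # REINFORCE update: grad log pi(a) = one_hot(a) - probs for a softmax policy.
    grad_log_pi = -probs
    grad_log_pi[idx] += 1.0
    theta += lr * (reward - baseline) * grad_log_pi

print("learned preference:", CANDIDATES[int(theta.argmax())])
```

Run as-is, the loop converges toward the question-style response because the stubbed simulator rewards it; with a real user simulator and a Seq2Seq policy, the same reward-minus-baseline update would instead adjust decoder parameters over sampled utterances.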
