Learning to Query, Reason, and Answer Questions On Ambiguous Texts

A key goal of research in conversational systems is to train an interactive agent to help a user with a task. Human conversation, however, is notoriously incomplete, ambiguous, and full of extraneous detail. To operate effectively, the agent must not only understand what was explicitly conveyed but also be able to reason in the presence of missing or unclear information. When unable to resolve ambiguities on its own, the agent must be able to ask the user for the necessary clarifications and incorporate the response into its reasoning. Motivated by this problem, we introduce QRAQ (Query, Reason, and Answer Questions), a new synthetic domain in which a User gives an Agent a short story and asks a challenge question. These problems are designed to test the reasoning and interaction capabilities of a learning-based Agent in a setting that requires multiple conversational turns. A good Agent should ask only non-deducible, relevant questions until it has enough information to correctly answer the User's question. We use standard and improved reinforcement-learning-based memory-network architectures to solve QRAQ problems in the difficult setting where the reward signal only tells the Agent whether its final answer to the challenge question is correct. To provide an upper bound on the RL results, we also train the same architectures using supervised information that tells the Agent during training which variables to query and the answer to the challenge question. We evaluate our architectures on four QRAQ dataset types, and scale the complexity of each along multiple dimensions.
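The interaction protocol described above can be sketched as a minimal episode loop. This is an illustrative toy, not the paper's implementation: the class and variable names (`QRAQEpisode`, `loc`, `scripted_agent`) are hypothetical, the "agent" is hand-scripted rather than learned, and only the terminal reward structure (+1/-1 on the final answer, as in the paper's RL setting) is taken from the abstract.

```python
# Illustrative sketch of a QRAQ-style interaction, assuming a toy
# environment. All names here are hypothetical; the actual QRAQ
# datasets and agent architectures are defined in the paper.

class QRAQEpisode:
    """A toy episode: some variables are hidden from the agent, and the
    correct answer depends on their values."""

    def __init__(self, hidden, answer):
        self.hidden = dict(hidden)   # variable -> value, unknown to the agent
        self.answer = answer         # correct answer to the challenge question
        self.done = False

    def query(self, var):
        # The User reveals the value of a queried variable.
        return self.hidden.get(var)

    def respond(self, guess):
        # Terminal step: the only reward signal is whether the final
        # answer is correct (+1) or not (-1), matching the difficult
        # RL setting described in the abstract.
        self.done = True
        return 1 if guess == self.answer else -1


def scripted_agent(episode, relevant_vars):
    """A hand-coded stand-in for a learned policy: query each relevant
    variable, then answer based on the revealed values."""
    known = {v: episode.query(v) for v in relevant_vars}
    # Toy deduction rule: the answer is simply the value of 'loc'.
    guess = known.get("loc")
    return episode.respond(guess)


ep = QRAQEpisode(hidden={"loc": "kitchen", "item": "apple"}, answer="kitchen")
reward = scripted_agent(ep, relevant_vars=["loc"])
print(reward)  # prints 1: the scripted agent answers correctly
```

A learned agent would replace `scripted_agent` with a memory-network policy that decides at each turn whether to query a variable or commit to an answer, trained only from this terminal reward (or, in the supervised upper-bound setting, from labels for which variables to query and the correct answer).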
