Towards Solving Text-based Games by Producing Adaptive Action Spaces

To solve a text-based game, an agent needs to formulate valid text commands for a given context and find the ones that lead to success. Recent attempts at solving text-based games with deep reinforcement learning have focused on the latter, i.e., learning to act optimally when the valid actions are known in advance. In this work, we propose to tackle the former task and train a model that generates the set of all valid commands for a given context. We evaluate three generative models on a dataset generated with TextWorld. The best model generates valid commands that were unseen during training and achieves a high $F_1$ score on the test set.
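
As an illustration of how a generated command set might be scored against the set of valid commands, the following is a minimal sketch of a set-level $F_1$ for a single context. The abstract does not specify the exact evaluation protocol, so the function name `set_f1`, the exact-string matching of commands, and the example commands are all assumptions made for illustration only.

```python
def set_f1(predicted, reference):
    """Set-level F1 between a predicted and a reference command set.

    Assumption: commands are compared by exact string match and
    duplicates are ignored; the paper's protocol may differ.
    """
    predicted, reference = set(predicted), set(reference)
    if not predicted and not reference:
        # Both sets empty: treat as perfect agreement (a convention).
        return 1.0
    true_positives = len(predicted & reference)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(reference)
    return 2 * precision * recall / (precision + recall)


# Hypothetical example: three valid commands, three generated ones,
# two of which are valid.
valid = {"go north", "take key", "open door"}
generated = {"go north", "take key", "eat key"}
score = set_f1(generated, valid)  # precision = recall = 2/3
```

Under this convention a model is penalized both for generating invalid commands (lower precision) and for missing valid ones (lower recall), which matches the stated goal of producing the set of all valid commands.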
