Learning What to Talk about in Descriptive Games

Text generation requires a planning module to select an object of discourse and its properties. This is especially hard in descriptive games, in which a computer agent tries to describe some aspects of a game world. We propose to formalize this problem as a Markov Decision Process, in which an optimal message policy can be defined and learned through simulation. Furthermore, we propose back-off policies as a novel and effective technique to combat state-dimensionality explosion in this framework.
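As an informal illustration of the two ideas above (not the paper's actual game or algorithm), the following sketch combines tabular Q-learning over feature-tuple states with an n-gram-style back-off: updates are applied at every level of abstraction of the state, and queries fall back to the most specific abstraction that has enough support. All class, action, and feature names here are hypothetical.

```python
import random
from collections import defaultdict

class BackoffQPolicy:
    """Toy message policy: states are feature tuples (e.g. (color, shape)),
    actions are candidate messages. Unseen full states back off to the
    longest state prefix with at least `min_visits` updates, so sparse
    experience still yields a usable policy."""

    def __init__(self, actions, alpha=0.2, gamma=0.9, min_visits=3):
        self.actions = list(actions)
        self.alpha, self.gamma = alpha, gamma
        self.min_visits = min_visits
        self.q = defaultdict(float)      # (state_key, action) -> value
        self.visits = defaultdict(int)   # state_key -> update count

    def _key(self, state):
        # Most specific prefix of the feature tuple with enough support;
        # the empty tuple () is the fully backed-off (global) policy.
        for i in range(len(state), -1, -1):
            if i == 0 or self.visits[state[:i]] >= self.min_visits:
                return state[:i]

    def value(self, state):
        k = self._key(state)
        return max(self.q[(k, a)] for a in self.actions)

    def best_action(self, state):
        k = self._key(state)
        return max(self.actions, key=lambda a: self.q[(k, a)])

    def update(self, state, action, reward, next_state=None):
        # One Q-learning step, replicated at every abstraction level.
        target = reward + (self.gamma * self.value(next_state)
                           if next_state is not None else 0.0)
        for i in range(len(state) + 1):
            k = state[:i]
            self.visits[k] += 1
            self.q[(k, action)] += self.alpha * (target - self.q[(k, action)])

policy = BackoffQPolicy(["mention_color", "mention_shape"])
# Simulated episodes: mentioning color is rewarded for red objects.
for _ in range(10):
    policy.update(("red", "circle"), "mention_color", 1.0)
    policy.update(("red", "circle"), "mention_shape", 0.0)
# The full state ("red", "square") was never visited, but the policy
# backs off to the ("red",) abstraction and still chooses well.
print(policy.best_action(("red", "square")))
```

The back-off here mirrors back-off smoothing in language modeling: rather than leaving unvisited regions of the state space at their uninformative initial values, the agent reuses estimates from coarser state descriptions, trading specificity for coverage.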