Imagination-Based Decision Making with Physical Models in Deep Neural Networks

Decision-making is challenging in continuous settings where complex sequences of events determine rewards, even when these event sequences are largely observable. In particular, traditional trial-and-error learning strategies may struggle to associate continuous actions with their rewards because of the size of the state space and the complexity of the reward function. Given a model of the world, a different strategy is to use imagination to exploit the knowledge embedded in that model. In this regime, the system directly optimizes the decision for each episode based on predictions from the model. We extend deep learning methods that have previously been used for model-free learning and apply them to a model-based approach in which an expert is consulted multiple times in the agent's imagination before the agent takes an action in the world. We show preliminary results on a difficult physical reasoning task where our model-based approach outperforms a model-free baseline, even when using an inaccurate expert.
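The core loop sketched in the abstract, consulting a (possibly inaccurate) expert model many times in imagination and only then acting, can be illustrated with a minimal random-shooting optimizer. Everything below is an illustrative assumption rather than the paper's actual architecture: the quadratic rewards, the offset `expert_reward` standing in for an imperfect learned model, and the one-dimensional action space are all hypothetical.

```python
import numpy as np

# Hypothetical world: the true reward of a continuous action peaks at 0.5.
def true_reward(action):
    return -(action - 0.5) ** 2

# Hypothetical "expert": an inaccurate model of the world whose predicted
# reward peaks slightly off the true optimum, mimicking model error.
def expert_reward(action):
    return -(action - 0.45) ** 2

def imagine_and_act(n_imaginations=256, seed=0):
    """Consult the expert many times in imagination (one call per
    candidate action), then commit to the best imagined candidate.
    This is plain random-shooting optimization over the action."""
    rng = np.random.default_rng(seed)
    candidates = rng.uniform(-1.0, 1.0, size=n_imaginations)
    imagined = expert_reward(candidates)  # imagined rollouts
    return candidates[np.argmax(imagined)]

if __name__ == "__main__":
    action = imagine_and_act()
    print(action, true_reward(action))
```

Even with the deliberately biased expert, the selected action lands near the true optimum, loosely echoing the abstract's observation that an inaccurate expert can still support good decisions when queried many times in imagination.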