Imagination-Based Decision Making with Physical Models in Deep Neural Networks

Decision-making is challenging in continuous settings where complex sequences of events determine rewards, even when these event sequences are largely observable. In particular, traditional trial-and-error learning strategies may struggle to associate continuous actions with their rewards because of the size of the state space and the complexity of the reward function. Given a model of the world, a different strategy is to use imagination to exploit the knowledge embedded in that model. In this regime, the system directly optimizes the decision for each episode based on predictions from the model. We extend deep learning methods that have previously been used for model-free learning and apply them to a model-based approach in which an expert is consulted multiple times in the agent's imagination before the agent takes an action in the world. We show preliminary results on a difficult physical reasoning task where our model-based approach outperforms a model-free baseline, even when using an inaccurate expert.
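The core loop sketched in the abstract, consulting a (possibly inaccurate) expert model many times in imagination and only then acting, can be illustrated with a minimal random-shooting optimizer. Everything below is an illustrative assumption rather than the paper's actual architecture: the quadratic rewards, the offset `expert_reward` standing in for an imperfect learned model, and the one-dimensional action space are all hypothetical.

```python
import numpy as np

# Hypothetical world: the true reward of a continuous action peaks at 0.5.
def true_reward(action):
    return -(action - 0.5) ** 2

# Hypothetical "expert": an inaccurate model of the world whose predicted
# reward peaks slightly off the true optimum, mimicking model error.
def expert_reward(action):
    return -(action - 0.45) ** 2

def imagine_and_act(n_imaginations=256, seed=0):
    """Consult the expert many times in imagination (one call per
    candidate action), then commit to the best imagined candidate.
    This is plain random-shooting optimization over the action."""
    rng = np.random.default_rng(seed)
    candidates = rng.uniform(-1.0, 1.0, size=n_imaginations)
    imagined = expert_reward(candidates)  # imagined rollouts
    return candidates[np.argmax(imagined)]

if __name__ == "__main__":
    action = imagine_and_act()
    print(action, true_reward(action))
```

Even with the deliberately biased expert, the selected action lands near the true optimum, loosely echoing the abstract's observation that an inaccurate expert can still support good decisions when queried many times in imagination.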