Paper information - Supplementary Material: Zero-Shot Task Generalization with Multi-Task Deep Reinforcement Learning


Inter/Extrapolation. In this experiment, a task is defined by three parameters: action, object, and number. The agent should repeat the same subtask a given number of times. The agent is trained on all configurations of actions and target objects, but only a subset of numbers is used during training. To interpolate and extrapolate to the unseen numbers, we define analogies based on simple arithmetic such as:
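As a rough illustration of the setup described above, the following minimal sketch represents a task as an (action, object, number) triple, builds a training set that covers every action and object but only some numbers, and checks one possible form of arithmetic analogy. The action names, object names, number ranges, and the rule that two task pairs are analogous when their number differences match are all assumptions made for illustration; none of them is taken from the paper.

```python
from itertools import product
from typing import List, NamedTuple, Sequence


class Task(NamedTuple):
    """A task parameterized by action, target object, and repeat count."""
    action: str
    obj: str
    number: int


# Hypothetical values, chosen only for illustration.
ACTIONS = ("pick_up", "transform")
OBJECTS = ("A", "B")
TRAIN_NUMBERS = (1, 3, 5, 7)  # assumed subset of numbers seen in training


def training_tasks() -> List[Task]:
    """All action/object configurations, restricted to the training numbers."""
    return [Task(a, o, n) for a, o, n in product(ACTIONS, OBJECTS, TRAIN_NUMBERS)]


def split_kind(number: int, train_numbers: Sequence[int] = TRAIN_NUMBERS) -> str:
    """Classify a number as seen, interpolation, or extrapolation."""
    if number in train_numbers:
        return "seen"
    if min(train_numbers) < number < max(train_numbers):
        return "interpolation"
    return "extrapolation"


def is_arithmetic_analogy(a: Task, b: Task, c: Task, d: Task) -> bool:
    """Assumed rule: a : b :: c : d holds when each pair keeps its action and
    object fixed and both pairs change the number by the same amount."""
    same_within_pairs = (
        (a.action, a.obj) == (b.action, b.obj)
        and (c.action, c.obj) == (d.action, d.obj)
    )
    return same_within_pairs and (b.number - a.number) == (d.number - c.number)
```

Under these assumptions, `split_kind(4)` falls between training numbers and is an interpolation case, while `split_kind(9)` lies outside the training range and is an extrapolation case.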

Honglak Lee | Junhyuk Oh | Pushmeet Kohli | Satinder Singh
