Supplementary Material: Zero-Shot Task Generalization with Multi-Task Deep Reinforcement Learning

Inter/Extrapolation. In this experiment, a task is defined by three parameters: action, object, and number. The agent should repeat the same subtask a given number of times. The agent is trained on all combinations of actions and target objects, but only on a subset of numbers. To interpolate and extrapolate to numbers not used during training, we define analogies based on simple arithmetic, such as: