Improving Batch Reinforcement Learning Performance through Transfer of Samples

The main objective of transfer in reinforcement learning is to reduce the complexity of learning the solution of a target task by effectively reusing knowledge retained from solving a set of source tasks. One of the main problems is to avoid negative transfer, that is, the transfer of knowledge across tasks that are so different that it worsens the learning performance. In this paper, we introduce a novel algorithm that selectively transfers samples (i.e., ⟨s, a, s′, r⟩ tuples) from source tasks to the target task and uses them as input for batch reinforcement-learning algorithms. By transferring samples from the source tasks that are most similar to the target task, we reduce the number of samples that must actually be collected from the target task to learn its solution. We show that the proposed approach is effective in reducing the learning complexity even when some source tasks are significantly different from the target task.
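The selective transfer described above can be illustrated with a minimal sketch. The code below is not the paper's algorithm; it is a hypothetical simplification in which each source task is scored by how well the target task's observed rewards predict the source samples' rewards at nearby state-action pairs (a stand-in compliance measure), and samples are then transferred from the highest-scoring sources first, up to a fixed budget. The function names `task_compliance` and `transfer_samples`, the Gaussian-kernel predictor, and the `budget` parameter are all assumptions made for illustration.

```python
import math
from typing import List, Tuple

# A batch-RL sample: (state, action, reward, next_state), 1-D for simplicity.
Sample = Tuple[float, float, float, float]

def task_compliance(source: List[Sample], target: List[Sample],
                    bandwidth: float = 0.5) -> float:
    """Hypothetical similarity score between a source task and the target task.

    Predicts each source sample's reward from the target batch with a
    Gaussian kernel over (state, action) and returns the negative mean
    squared prediction error: higher means the tasks look more alike,
    i.e., lower risk of negative transfer.
    """
    total = 0.0
    for s, a, r, _ in source:
        num, den = 0.0, 0.0
        for ts, ta, tr, _ in target:
            w = math.exp(-((s - ts) ** 2 + (a - ta) ** 2) / (2 * bandwidth ** 2))
            num += w * tr
            den += w
        pred = num / den if den > 0 else 0.0
        total += (r - pred) ** 2
    return -total / len(source)

def transfer_samples(sources: List[List[Sample]], target: List[Sample],
                     budget: int) -> List[Sample]:
    """Augment the target batch with up to `budget` samples, drawn from
    the most compliant source tasks first."""
    ranked = sorted(sources, key=lambda src: task_compliance(src, target),
                    reverse=True)
    augmented = list(target)
    remaining = budget
    for src in ranked:
        take = min(remaining, len(src))
        augmented.extend(src[:take])
        remaining -= take
        if remaining == 0:
            break
    return augmented
```

In this sketch a source task whose reward function matches the target's receives a compliance score near zero, while a dissimilar one receives a strongly negative score, so its samples are transferred only if the budget is not already exhausted by more similar sources.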