Multi-Task Deep Reinforcement Learning for Continuous Action Control

In this paper, we propose a deep reinforcement learning algorithm that learns multiple tasks concurrently. The algorithm uses a new network architecture that reduces the number of parameters needed per task by more than 75% compared to typical single-task deep reinforcement learning algorithms. The proposed algorithm and network fuse images with sensor data and were tested on up to 12 movement-based control tasks on a simulated Pioneer 3AT robot equipped with a camera and range sensors. Results show that the proposed algorithm and network learn skills as good as those learned by a comparable single-task learning algorithm. Results also show that learning performance remains consistent as the number of tasks and the number of constraints on the tasks increase.
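The per-task parameter saving comes from sharing one feature trunk (convolutional layers for the camera image plus a branch for the range-sensor vector, fused into a common representation) across all tasks, so each additional task only adds a small output head. The sketch below illustrates this accounting with purely hypothetical layer sizes (input resolution, channel counts, and head width are assumptions, not the paper's actual architecture); the exact reduction depends on those sizes, but the shared-trunk mechanism is what drives it past 75%.

```python
def conv_params(c_in, c_out, k):
    """Parameters in a k x k conv layer: weights plus biases."""
    return c_in * c_out * k * k + c_out

def fc_params(n_in, n_out):
    """Parameters in a fully connected layer: weights plus biases."""
    return n_in * n_out + n_out

# Hypothetical shared trunk: conv stack over the camera image,
# a small branch for the range-sensor vector, and a fusion layer.
trunk = (conv_params(3, 32, 8)            # image conv layers
         + conv_params(32, 64, 4)
         + conv_params(64, 64, 3)
         + fc_params(64 * 7 * 7, 512)     # flattened conv features
         + fc_params(16, 64)              # range-sensor branch
         + fc_params(512 + 64, 256))      # fuse image + sensor features

# Hypothetical per-task head producing continuous actions
# (e.g. two wheel velocities).
head = fc_params(256, 128) + fc_params(128, 2)

single_task_total = trunk + head   # single-task learner: everything per task
multi_task_marginal = head         # multi-task learner: trunk is shared

reduction = 1 - multi_task_marginal / single_task_total
print(f"per-task parameter reduction: {reduction:.1%}")
```

With these (assumed) sizes the shared trunk dominates the parameter count, so the marginal cost of adding a task is a small fraction of a full single-task network, comfortably exceeding the 75% reduction reported in the abstract.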
