Federated Deep Reinforcement Learning

In deep reinforcement learning, building high-quality policies is challenging when the feature space of states is small and the training data is limited. Despite the success of previous transfer learning approaches in deep reinforcement learning, directly transferring data or models from one agent to another is often prohibited in privacy-aware applications due to the sensitivity of the data and/or models. In this paper, we propose a novel deep reinforcement learning framework, Federated deep Reinforcement Learning (FedRL), which federatively builds high-quality models for agents while preserving the privacy of their data and models. To protect that privacy, we apply Gaussian differential privacy to the information agents share with each other when updating their local models. In experiments, we evaluate our FedRL framework against various baselines in two diverse domains, Grid-world and Text2Action.
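To make the privacy mechanism concrete, the following is a minimal sketch (not the paper's actual implementation) of one federated round in which each agent perturbs its shared update with Gaussian noise before a server-side average. The noise scale follows the standard Gaussian mechanism from differential privacy; the function names, the fixed sensitivity, and the FedAvg-style aggregation are illustrative assumptions.

```python
import numpy as np

def gaussian_dp_noise(values, sensitivity, epsilon, delta, rng):
    """Perturb `values` with the Gaussian mechanism.

    Uses the classic calibration sigma = sensitivity * sqrt(2 ln(1.25/delta)) / epsilon,
    which gives (epsilon, delta)-differential privacy for epsilon < 1.
    """
    sigma = sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon
    return values + rng.normal(0.0, sigma, size=values.shape)

def federated_round(local_weights, sensitivity=1.0, epsilon=1.0,
                    delta=1e-5, seed=0):
    """One illustrative round: each agent adds noise to its local weights
    before sharing, then the server averages the noisy updates."""
    rng = np.random.default_rng(seed)
    noisy = [gaussian_dp_noise(w, sensitivity, epsilon, delta, rng)
             for w in local_weights]
    return np.mean(noisy, axis=0)
```

Under this scheme, only perturbed quantities ever leave an agent, so neither raw trajectories nor exact model parameters are exposed; the cost is extra variance in the aggregated model, controlled by the privacy budget epsilon.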
