Federated Reinforcement Learning for Fast Personalization

Understanding user behavior and adapting to it has long been an important focus for interactive applications; this adaptation is commonly called personalization. Personalization is sought after in gaming, personal assistants, dialogue managers, and other popular application categories. A key challenge for personalization methods is the time they take to adapt to a user's behavior or reactions, which can be detrimental to the user experience. The contribution of this work is twofold: (1) demonstrating the applicability of granular (per-user) personalization through reinforcement learning, and (2) proposing a novel strategy, based on federated learning, to reduce personalization time. To our knowledge, this paper is among the first to present an overall architecture for federated reinforcement learning (FRL), comprising a grouping policy, a learning policy, and a federation policy. We demonstrate the efficacy of the proposed architecture on a non-player character in the Atari game Pong, scaling the implementation across 3, 4, and 5 users, and achieve a median improvement of ~17% in personalization time.
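The federation policy described above can be pictured as a FedAvg-style aggregation of per-user policy parameters. The paper does not specify the exact aggregation rule; the sketch below assumes a sample-weighted average of each user's policy-network weights, with the function name `federate` and the toy network shapes being illustrative choices rather than the authors' implementation.

```python
import numpy as np

def federate(user_weights, user_samples):
    """Sample-weighted average of per-user policy parameters (FedAvg-style).

    user_weights: list of per-user parameter lists (one ndarray per layer).
    user_samples: list of per-user experience counts used as weights.
    """
    total = sum(user_samples)
    n_layers = len(user_weights[0])
    return [
        sum(w[layer] * (n / total) for w, n in zip(user_weights, user_samples))
        for layer in range(n_layers)
    ]

# Three users, each holding a small two-layer policy (weights + bias).
rng = np.random.default_rng(0)
users = [[rng.normal(size=(4, 2)), rng.normal(size=2)] for _ in range(3)]
samples = [100, 200, 300]  # e.g. per-user experience collected since last round

global_weights = federate(users, samples)  # broadcast back to all users
```

After each federation round, the aggregated parameters would be sent back to every user's local agent, which then continues learning from its own interactions.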
