A Decentralized Policy Gradient Approach to Multi-task Reinforcement Learning
[1] Jakub W. Pachocki,et al. Dota 2 with Large Scale Deep Reinforcement Learning , 2019, ArXiv.
[2] Jan Peters,et al. Reinforcement learning in robotics: A survey , 2013, Int. J. Robotics Res..
[3] Andre Esteva,et al. A guide to deep learning in healthcare , 2019, Nature Medicine.
[4] Sandeep Chinchali,et al. Multi-agent Reinforcement Learning for Networked System Control , 2020, ICLR.
[5] Jiming Liu,et al. Reinforcement Learning in Healthcare: A Survey , 2019, ACM Comput. Surv..
[6] Volkan Cevher,et al. Optimization for Reinforcement Learning: From a single agent to cooperative agents , 2020, IEEE Signal Processing Magazine.
[7] Eduardo F. Morales,et al. An Introduction to Reinforcement Learning , 2011 .
[8] Qing Ling,et al. On the Convergence of Decentralized Gradient Descent , 2013, SIAM J. Optim..
[9] Alex Graves,et al. Playing Atari with Deep Reinforcement Learning , 2013, ArXiv.
[10] Guanghui Lan. Policy Mirror Descent for Reinforcement Learning: Linear Convergence, New Sampling Complexity, and Generalized Problem Classes , 2021, ArXiv.
[11] Anna Semakova,et al. Decentralized multi-agent tracking of unknown environmental level sets by a team of nonholonomic robots , 2014, 2014 6th International Congress on Ultra Modern Telecommunications and Control Systems and Workshops (ICUMT).
[12] Martin L. Puterman,et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming , 1994 .
[13] Wojciech Czarnecki,et al. Multi-task Deep Reinforcement Learning with PopArt , 2018, AAAI.
[14] Alex Graves,et al. Asynchronous Methods for Deep Reinforcement Learning , 2016, ICML.
[15] Shane Legg,et al. Human-level control through deep reinforcement learning , 2015, Nature.
[16] Razvan Pascanu,et al. Policy Distillation , 2015, ICLR.
[17] Wolfram Burgard,et al. Socially compliant mobile robot navigation via inverse reinforcement learning , 2016, Int. J. Robotics Res..
[18] Ali Farhadi,et al. Target-driven visual navigation in indoor scenes using deep reinforcement learning , 2016, 2017 IEEE International Conference on Robotics and Automation (ICRA).
[19] Tamer Basar,et al. Multi-Agent Reinforcement Learning: A Selective Overview of Theories and Algorithms , 2019, Handbook of Reinforcement Learning and Control.
[20] Yishay Mansour,et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation , 1999, NIPS.
[21] Sergey Levine,et al. Trust Region Policy Optimization , 2015, ICML.
[22] Thinh T. Doan,et al. Finite-Time Analysis of Distributed TD(0) with Linear Function Approximation on Multi-Agent Reinforcement Learning , 2019, ICML.
[23] John Langford,et al. Approximately Optimal Approximate Reinforcement Learning , 2002, ICML.
[24] Shane Legg,et al. IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures , 2018, ICML.
[25] Alex Olshevsky,et al. Linear Time Average Consensus on Fixed Graphs and Implications for Decentralized Optimization and Multi-Agent Control , 2014, arXiv:1411.4186.
[26] H. Vincent Poor,et al. QD-Learning: A Collaborative Distributed Strategy for Multi-Agent Reinforcement Learning Through Consensus + Innovations , 2012, IEEE Trans. Signal Process..
[27] Tamer Basar,et al. Fully Decentralized Multi-Agent Reinforcement Learning with Networked Agents , 2018, ICML.
[28] Shane Legg,et al. Massively Parallel Methods for Deep Reinforcement Learning , 2015, ArXiv.
[29] Zhuoran Yang,et al. Multi-Agent Reinforcement Learning via Double Averaging Primal-Dual Optimization , 2018, NeurIPS.
[30] Henry Zhu,et al. Soft Actor-Critic Algorithms and Applications , 2018, ArXiv.
[31] Sham M. Kakade,et al. Global Convergence of Policy Gradient Methods for the Linear Quadratic Regulator , 2018, ICML.
[32] Hongyuan Zha,et al. F2A2: Flexible Fully-decentralized Approximate Actor-critic for Cooperative Multi-agent Reinforcement Learning , 2020, ArXiv.
[33] S. Kakade,et al. Optimality and Approximation with Policy Gradient Methods in Markov Decision Processes , 2019, COLT.
[34] Yan Zhang,et al. Distributed off-Policy Actor-Critic Reinforcement Learning with Policy Consensus , 2019, 2019 IEEE 58th Conference on Decision and Control (CDC).
[35] Zeb Kurth-Nelson,et al. Learning to reinforcement learn , 2016, CogSci.
[36] Sergey Levine,et al. Learning to Adapt in Dynamic, Real-World Environments through Meta-Reinforcement Learning , 2018, ICLR.
[37] Katja Hofmann,et al. Decoding multitask DQN in the world of Minecraft , 2019 .
[38] Sergey Levine,et al. Learning Invariant Feature Spaces to Transfer Skills with Reinforcement Learning , 2017, ICLR.
[39] Tao Yang,et al. Distributed Stochastic Gradient Method for Non-Convex Problems with Applications in Supervised Learning , 2019, 2019 IEEE 58th Conference on Decision and Control (CDC).
[40] Arijit Raychowdhury,et al. NavREn-Rl: Learning to fly in real environment via end-to-end deep reinforcement learning using monocular images , 2018, 2018 25th International Conference on Mechatronics and Machine Vision in Practice (M2VIP).
[41] Demis Hassabis,et al. Mastering the game of Go with deep neural networks and tree search , 2016, Nature.
[42] Adam Wierman,et al. Scalable Reinforcement Learning of Localized Policies for Multi-Agent Networked Systems , 2019, L4DC.
[43] Hsiu-Chin Lin,et al. Learning task constraints in operational space formulation , 2017, 2017 IEEE International Conference on Robotics and Automation (ICRA).
[44] Joelle Pineau,et al. Gossip-based Actor-Learner Architectures for Deep Reinforcement Learning , 2019, NeurIPS.
[45] David Filliat,et al. DisCoRL: Continual Reinforcement Learning via Policy Distillation , 2019, ArXiv.
[46] Thinh T. Doan,et al. Finite-Time Analysis of Decentralized Stochastic Approximation with Applications in Multi-Agent and Multi-Task Learning , 2020, 2021 60th IEEE Conference on Decision and Control (CDC).
[47] S. Levine,et al. Gradient Surgery for Multi-Task Learning , 2020, NeurIPS.
[48] Andrea Bonarini,et al. Sharing Knowledge in Multi-Task Deep Reinforcement Learning , 2020, ICLR.
[49] Mihailo R. Jovanovic,et al. Fast Multi-Agent Temporal-Difference Learning via Homotopy Stochastic Primal-Dual Optimization , 2019, ArXiv.
[50] Sergey Levine,et al. (CAD)$^2$RL: Real Single-Image Flight without a Single Real Image , 2016, Robotics: Science and Systems.
[51] Arijit Raychowdhury,et al. Autonomous Navigation via Deep Reinforcement Learning for Resource Constraint Edge Nodes Using Transfer Learning , 2019, IEEE Access.