Communication-Efficient Policy Gradient Methods for Distributed Reinforcement Learning
[1] Ohad Shamir,et al. Learnability, Stability and Uniform Convergence , 2010, J. Mach. Learn. Res.
[2] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[3] Karl Henrik Johansson,et al. Distributed Event-Triggered Control for Multi-Agent Systems , 2012, IEEE Transactions on Automatic Control.
[4] Alexander J. Smola,et al. Scaling Distributed Machine Learning with the Parameter Server , 2014, OSDI.
[5] Dan Alistarh,et al. QSGD: Communication-Optimal Stochastic Gradient Descent, with Applications to Training Neural Networks , 2016, 1610.02132.
[6] Alejandro Ribeiro,et al. Stochastic Policy Gradient Ascent in Reproducing Kernel Hilbert Spaces , 2018, IEEE Transactions on Automatic Control.
[7] Martin Lauer,et al. An Algorithm for Distributed Reinforcement Learning in Cooperative Multi-Agent Systems , 2000, ICML.
[8] Marc Peter Deisenroth,et al. Efficient reinforcement learning using Gaussian processes , 2010.
[9] Blaise Agüera y Arcas,et al. Communication-Efficient Learning of Deep Networks from Decentralized Data , 2016, AISTATS.
[10] Craig Boutilier,et al. The Dynamics of Reinforcement Learning in Cooperative Multiagent Systems , 1998, AAAI/IAAI.
[11] Peter Dayan,et al. Q-learning , 1992, Machine Learning.
[12] Sham M. Kakade,et al. A Natural Policy Gradient , 2001, NIPS.
[13] Michael L. Littman,et al. Packet Routing in Dynamically Changing Networks: A Reinforcement Learning Approach , 1993, NIPS.
[14] Mykel J. Kochenderfer,et al. Cooperative Multi-agent Control Using Deep Reinforcement Learning , 2017, AAMAS Workshops.
[15] Sergey Levine,et al. Trust Region Policy Optimization , 2015, ICML.
[16] Tamer Basar,et al. Fully Decentralized Multi-Agent Reinforcement Learning with Networked Agents , 2018, ICML.
[17] Alex Graves,et al. Asynchronous Methods for Deep Reinforcement Learning , 2016, ICML.
[18] Dorian Kodelja,et al. Multiagent cooperation and competition with deep reinforcement learning , 2015, PloS one.
[19] Manuela M. Veloso,et al. Multiagent Systems: A Survey from a Machine Learning Perspective , 2000, Auton. Robots.
[20] Andrew W. Moore,et al. Distributed Value Functions , 1999, ICML.
[21] Vladimir N. Vapnik,et al. The Nature of Statistical Learning Theory , 2000, Statistics for Engineering and Information Science.
[22] Yishay Mansour,et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation , 1999, NIPS.
[23] Yuval Tassa,et al. Continuous control with deep reinforcement learning , 2015, ICLR.
[24] Amnon Shashua,et al. Safe, Multi-Agent, Reinforcement Learning for Autonomous Driving , 2016, ArXiv.
[25] Alec Radford,et al. Proximal Policy Optimization Algorithms , 2017, ArXiv.
[26] Dale Schuurmans,et al. MapReduce for Parallel Reinforcement Learning , 2011, EWRL.
[27] Ashutosh Nayyar,et al. Decentralized Stochastic Control with Partial History Sharing: A Common Information Approach , 2012, IEEE Transactions on Automatic Control.
[28] Georgios B. Giannakis,et al. Bandit Convex Optimization for Scalable and Dynamic IoT Management , 2017, IEEE Internet of Things Journal.
[29] Ronald J. Williams,et al. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning , 2004, Machine Learning.
[30] Alborz Geramifard,et al. Cooperative Mission Planning for Multi-UAV Teams , 2015.
[31] Stephen J. Wright,et al. Hogwild: A Lock-Free Approach to Parallelizing Stochastic Gradient Descent , 2011, NIPS.
[32] Marcello Restelli,et al. Adaptive Batch Size for Safe Policy Gradients , 2017, NIPS.
[33] Yi Wu,et al. Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments , 2017, NIPS.
[34] Peter L. Bartlett,et al. Infinite-Horizon Policy-Gradient Estimation , 2001, J. Artif. Intell. Res.
[35] Georgios B. Giannakis,et al. LAG: Lazily Aggregated Gradient for Communication-Efficient Distributed Learning , 2018, NeurIPS.
[36] Georgios B. Giannakis,et al. Communication-Efficient Distributed Reinforcement Learning , 2018, ArXiv.
[37] Yun Yang,et al. Communication-Efficient Distributed Statistical Inference , 2016, Journal of the American Statistical Association.
[38] R. Schapire. The Strength of Weak Learnability , 1990, Machine Learning.
[39] Wojciech Zaremba,et al. OpenAI Gym , 2016, ArXiv.
[40] Marcello Restelli,et al. Stochastic Variance-Reduced Policy Gradient , 2018, ICML.
[41] Yuchen Zhang,et al. DiSCO: Distributed Optimization for Self-Concordant Empirical Loss , 2015, ICML.
[42] Randy H. Katz,et al. A Berkeley View of Systems Challenges for AI , 2017, ArXiv.
[43] Neil Immerman,et al. The Complexity of Decentralized Control of Markov Decision Processes , 2000, UAI.
[44] Zhuoran Yang,et al. Multi-Agent Reinforcement Learning via Double Averaging Primal-Dual Optimization , 2018, NeurIPS.
[45] Kagan Tumer,et al. General principles of learning-based multi-agent systems , 1999, AGENTS '99.
[46] Jorge Cortés,et al. Event-triggered communication and control of networked systems for multi-agent consensus , 2017, Autom.
[47] Sebastian U. Stich,et al. Local SGD Converges Fast and Communicates Little , 2018, ICLR.
[48] Guy Lever,et al. Deterministic Policy Gradient Algorithms , 2014, ICML.
[49] Jonathan P. How,et al. Deep Decentralized Multi-task Multi-Agent Reinforcement Learning under Partial Observability , 2017, ICML.
[50] Shane Legg,et al. Massively Parallel Methods for Deep Reinforcement Learning , 2015, ArXiv.