Multi-Agent Reinforcement Learning in Time-varying Networked Systems