Multi-agent reinforcement learning as a rehearsal for decentralized planning
[1] Craig Boutilier,et al. Planning, Learning and Coordination in Multiagent Decision Processes , 1996, TARK.
[2] Maja J. Mataric,et al. Reinforcement Learning in the Multi-Robot Domain , 1997, Auton. Robots.
[3] Richard S. Sutton,et al. Introduction to Reinforcement Learning , 1998.
[4] Andrew Y. Ng,et al. Policy Invariance Under Reward Transformations: Theory and Application to Reward Shaping , 1999, ICML.
[5] Neil Immerman,et al. The Complexity of Decentralized Control of Markov Decision Processes , 2000, UAI.
[6] Peter Auer,et al. Using Confidence Bounds for Exploitation-Exploration Trade-offs , 2003, J. Mach. Learn. Res.
[7] C. Boutilier,et al. Accelerating Reinforcement Learning through Implicit Imitation , 2003, J. Artif. Intell. Res.
[8] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[9] François Charpillet,et al. Point-based Dynamic Programming for DEC-POMDPs , 2006, AAAI.
[10] Peter Stone,et al. Value-Function-Based Transfer for Reinforcement Learning Using Structure Mapping , 2006, AAAI.
[11] Bikramjit Banerjee,et al. General Game Learning Using Knowledge Transfer , 2007, IJCAI.
[12] Shlomo Zilberstein,et al. Memory-Bounded Dynamic Programming for DEC-POMDPs , 2007, IJCAI.
[13] Nikos A. Vlassis,et al. Q-value Heuristics for Approximate Solutions of Dec-POMDPs , 2007, AAAI Spring Symposium: Game Theoretic and Decision Theoretic Agents.
[14] Nikos A. Vlassis,et al. Optimal and Approximate Q-value Functions for Decentralized POMDPs , 2008, J. Artif. Intell. Res.
[15] Bart De Schutter,et al. A Comprehensive Survey of Multiagent Reinforcement Learning , 2008, IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews).
[16] Shimon Whiteson,et al. Lossless clustering of histories in decentralized POMDPs , 2009, AAMAS.
[17] Brahim Chaib-draa,et al. Point-based incremental pruning heuristic for solving finite-horizon DEC-POMDPs , 2009, AAMAS.
[18] Shlomo Zilberstein,et al. Incremental Policy Generation for Finite-Horizon DEC-POMDPs , 2009, ICAPS.
[19] Alain Dutech,et al. An Investigation into Mathematical Programming for Finite Horizon Decentralized POMDPs , 2010, J. Artif. Intell. Res.
[20] Feng Wu,et al. Rollout Sampling Policy Iteration for Decentralized POMDPs , 2010, UAI.
[21] Peter Stone,et al. Combining manual feedback with subsequent MDP reward signals for reinforcement learning , 2010, AAMAS.
[22] Frans A. Oliehoek,et al. Heuristic search for identical payoff Bayesian games , 2010, AAMAS.
[23] Frans A. Oliehoek,et al. Scaling Up Optimal Heuristic Search in Dec-POMDPs via Incremental Expansion , 2011, IJCAI.
[24] Victor R. Lesser,et al. Coordinated Multi-Agent Reinforcement Learning in Networked Distributed POMDPs , 2011, AAAI.
[25] Amir Massoud Farahmand,et al. Action-Gap Phenomenon in Reinforcement Learning , 2011, NIPS.
[26] Bikramjit Banerjee,et al. Sample Bounded Distributed Reinforcement Learning for Decentralized POMDPs , 2012, AAAI.
[27] Bikramjit Banerjee,et al. Informed Initial Policies for Learning in Dec-POMDPs , 2012, AAAI.
[28] Bikramjit Banerjee,et al. Concurrent reinforcement learning as a rehearsal for decentralized planning under uncertainty , 2013, AAMAS.