Likelihood Quantile Networks for Coordinating Multi-Agent Reinforcement Learning