Nicola Gatti | Marco Ciccone | Andrea Celli | Raffaele Bongo
[1] Tuomas Sandholm,et al. Safe and Nested Subgame Solving for Imperfect-Information Games , 2017, NIPS.
[2] Guillaume J. Laurent,et al. Hysteretic Q-learning: an algorithm for decentralized reinforcement learning in cooperative multi-agent teams , 2007, 2007 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[3] Bernhard von Stengel,et al. Extensive-Form Correlated Equilibrium: Definition and Computational Complexity , 2008, Math. Oper. Res.
[4] Sergey Levine,et al. Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor , 2018, ICML.
[5] Matthew E. Taylor,et al. A survey and critique of multiagent deep reinforcement learning , 2019, Autonomous Agents and Multi-Agent Systems.
[6] David Silver,et al. Deep Reinforcement Learning from Self-Play in Imperfect-Information Games , 2016, ArXiv.
[7] Demis Hassabis,et al. Mastering the game of Go with deep neural networks and tree search , 2016, Nature.
[8] R. Aumann. Subjectivity and Correlation in Randomized Strategies , 1974 .
[9] Victor R. Lesser,et al. Coordinating multi-agent reinforcement learning with limited communication , 2013, AAMAS.
[10] Boi Faltings,et al. Decentralized Anti-coordination Through Multi-agent Learning , 2013, J. Artif. Intell. Res.
[11] Ming Zhou,et al. Signal Instructed Coordination in Team Competition , 2019, ArXiv.
[12] Tuomas Sandholm,et al. Ex ante coordination and collusion in zero-sum multi-player extensive-form games , 2018, NeurIPS.
[13] Guy Lever,et al. Value-Decomposition Networks For Cooperative Multi-Agent Learning Based On Team Reward , 2018, AAMAS.
[14] Nikos A. Vlassis,et al. Optimal and Approximate Q-value Functions for Decentralized POMDPs , 2008, J. Artif. Intell. Res.
[15] Boi Faltings,et al. Reaching correlated equilibria through multi-agent learning , 2011, AAMAS.
[16] Jonathan P. How,et al. Deep Decentralized Multi-task Multi-Agent Reinforcement Learning under Partial Observability , 2017, ICML.
[17] Alec Radford,et al. Proximal Policy Optimization Algorithms , 2017, ArXiv.
[18] Kevin Waugh,et al. DeepStack: Expert-level artificial intelligence in heads-up no-limit poker , 2017, Science.
[19] Michael H. Bowling,et al. Regret Minimization in Games with Incomplete Information , 2007, NIPS.
[20] Alex Graves,et al. Playing Atari with Deep Reinforcement Learning , 2013, ArXiv.
[21] Richard S. Sutton,et al. Neuronlike adaptive elements that can solve difficult learning control problems , 1983, IEEE Transactions on Systems, Man, and Cybernetics.
[22] Herke van Hoof,et al. Addressing Function Approximation Error in Actor-Critic Methods , 2018, ICML.
[23] Craig Boutilier,et al. Sequential Optimality and Coordination in Multiagent Systems , 1999, IJCAI.
[24] David Silver,et al. Deep Reinforcement Learning with Double Q-Learning , 2015, AAAI.
[25] Bart De Schutter,et al. A Comprehensive Survey of Multiagent Reinforcement Learning , 2008, IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews).
[26] Demis Hassabis,et al. A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play , 2018, Science.
[27] Rob Fergus,et al. Learning Multiagent Communication with Backpropagation , 2016, NIPS.
[28] Max Jaderberg,et al. Population Based Training of Neural Networks , 2017, ArXiv.
[29] Guillaume J. Laurent,et al. Independent reinforcement learners in cooperative Markov games: a survey regarding coordination problems , 2012, The Knowledge Engineering Review.
[30] Craig Boutilier,et al. The Dynamics of Reinforcement Learning in Cooperative Multiagent Systems , 1998, AAAI/IAAI.
[31] Shimon Whiteson,et al. QMIX: Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning , 2018, ICML.
[32] Michael H. Bowling,et al. Bayes' Bluff: Opponent Modelling in Poker , 2005, UAI 2005.
[33] Nicola Basilico,et al. Team-Maxmin Equilibrium: Efficiency Bounds and Algorithms , 2016, AAAI.
[34] Shane Legg,et al. Human-level control through deep reinforcement learning , 2015, Nature.
[35] Manuela M. Veloso,et al. Rational and Convergent Learning in Stochastic Games , 2001, IJCAI.
[36] Sean Luke,et al. Cooperative Multi-Agent Learning: The State of the Art , 2005, Autonomous Agents and Multi-Agent Systems.
[37] Shimon Whiteson,et al. Learning to Communicate with Deep Multi-Agent Reinforcement Learning , 2016, NIPS.
[38] J. Nash. NON-COOPERATIVE GAMES , 1951, Classics in Game Theory.
[39] Yishay Mansour,et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation , 1999, NIPS.
[40] Guy Lever,et al. Human-level performance in 3D multiplayer games with population-based reinforcement learning , 2018, Science.
[41] K. Tuyls,et al. Lenient Frequency Adjusted Q-learning , 2010 .
[42] Martin Lauer,et al. Reinforcement learning for stochastic cooperative multi-agent-systems , 2004, Proceedings of the Third International Joint Conference on Autonomous Agents and Multiagent Systems, 2004. AAMAS 2004.
[43] Guillaume J. Laurent,et al. A study of FMQ heuristic in cooperative multi-agent games , 2008, AAMAS 2008.
[44] Xiangxiang Chu,et al. Parameter Sharing Deep Deterministic Policy Gradient for Cooperative Multi-agent Reinforcement Learning , 2017, ArXiv.
[45] Nicola Gatti,et al. Computational Results for Extensive-Form Adversarial Team Games , 2017, AAAI.
[46] Alex Graves,et al. Asynchronous Methods for Deep Reinforcement Learning , 2016, ICML.
[47] Yuval Tassa,et al. Continuous control with deep reinforcement learning , 2015, ICLR.
[48] Martin Lauer,et al. An Algorithm for Distributed Reinforcement Learning in Cooperative Multi-Agent Systems , 2000, ICML.
[49] Keith B. Hall,et al. Correlated Q-Learning , 2003, ICML.
[50] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[51] Noam Brown,et al. Superhuman AI for multiplayer poker , 2019, Science.
[52] Noam Brown,et al. Superhuman AI for heads-up no-limit poker: Libratus beats top professionals , 2018, Science.
[53] Bikramjit Banerjee,et al. Multi-agent reinforcement learning as a rehearsal for decentralized planning , 2016, Neurocomputing.
[54] Guy Lever,et al. Emergent Coordination Through Competition , 2019, ICLR.
[55] Tuomas Sandholm,et al. Deep Counterfactual Regret Minimization , 2018, ICML.
[56] J. Robinson. AN ITERATIVE METHOD OF SOLVING A GAME , 1951, Classics in Game Theory.
[57] Shimon Whiteson,et al. Counterfactual Multi-Agent Policy Gradients , 2017, AAAI.
[58] Yi Wu,et al. Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments , 2017, NIPS.
[59] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[60] Rahul Savani,et al. Lenient Multi-Agent Deep Reinforcement Learning , 2017, AAMAS.
[61] Yoav Shoham,et al. Multiagent Systems - Algorithmic, Game-Theoretic, and Logical Foundations , 2009 .
[62] Stefan Schaal,et al. Reinforcement learning of motor skills with policy gradients , 2008 .
[63] B. Stengel,et al. Team-Maxmin Equilibria , 1997 .