[1] Marvin Minsky,et al. Steps toward Artificial Intelligence , 1961, Proceedings of the IRE.
[2] Andrew Y. Ng,et al. Policy Invariance Under Reward Transformations: Theory and Application to Reward Shaping , 1999, ICML.
[3] Rob Fergus,et al. Learning Multiagent Communication with Backpropagation , 2016, NIPS.
[4] D. Kahneman,et al. Heuristics and Biases: The Psychology of Intuitive Judgment , 2002 .
[5] Yi Wu,et al. Robust Multi-Agent Reinforcement Learning via Minimax Deep Deterministic Policy Gradient , 2019, AAAI.
[6] Stephen Clark,et al. Emergent Communication through Negotiation , 2018, ICLR.
[7] Tamer Basar,et al. Multi-Agent Reinforcement Learning: A Selective Overview of Theories and Algorithms , 2019, Handbook of Reinforcement Learning and Control.
[8] Sonia Chernova,et al. Learning from Demonstration for Shaping through Inverse Reinforcement Learning , 2016, AAMAS.
[9] Shlomo Zilberstein,et al. Dynamic Programming for Partially Observable Stochastic Games , 2004, AAAI.
[10] Michael L. Littman,et al. Markov Games as a Framework for Multi-Agent Reinforcement Learning , 1994, ICML.
[11] Tiejian Luo,et al. Learning to Communicate via Supervised Attentional Message Processing , 2018, CASA.
[12] Neil Immerman,et al. The Complexity of Decentralized Control of Markov Decision Processes , 2000, UAI.
[13] Kagan Tumer,et al. Modeling difference rewards for multiagent learning , 2012, AAMAS.
[14] Kenneth O. Stanley,et al. ES is more than just a traditional finite-difference approximator , 2017, GECCO.
[15] Shlomo Zilberstein,et al. Dynamic Programming Approximations for Partially Observable Stochastic Games , 2009, FLAIRS.
[16] Saeid Nahavandi,et al. Deep Reinforcement Learning for Multiagent Systems: A Review of Challenges, Solutions, and Applications , 2018, IEEE Transactions on Cybernetics.
[17] Michael H. Bowling,et al. Evaluating state-space abstractions in extensive-form games , 2013, AAMAS.
[18] Masayoshi Tomizuka,et al. Learning Human Rewards by Inferring Their Latent Intelligence Levels in Multi-Agent Games: A Theory-of-Mind Approach with Application to Driving Data , 2021, 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).
[19] Joelle Pineau,et al. An Actor-Critic Algorithm for Sequence Prediction , 2016, ICLR.
[20] Stefan Schaal,et al. Natural Actor-Critic , 2003, Neurocomputing.
[21] L. Shapley,et al. Stochastic Games , 1953, Proceedings of the National Academy of Sciences.
[22] Stephen J. Roberts,et al. Learning Against Non-Stationary Agents with Opponent Modelling and Deep Reinforcement Learning , 2018, AAAI Spring Symposia.
[23] Emil Gustavsson,et al. Learning to Play Guess Who? and Inventing a Grounded Language as a Consequence , 2016, ArXiv.
[24] Jorge Gomes,et al. Dynamic Team Heterogeneity in Cooperative Coevolutionary Algorithms , 2018, IEEE Transactions on Evolutionary Computation.
[25] Kagan Tumer,et al. Multi-objective Multiagent Credit Assignment Through Difference Rewards in Reinforcement Learning , 2014, SEAL.
[26] Dorian Kodelja,et al. Multiagent cooperation and competition with deep reinforcement learning , 2015, PloS one.
[27] Kevin R. McKee,et al. Neural Recursive Belief States in Multi-Agent Reinforcement Learning , 2021, ArXiv.
[28] Yan Zheng,et al. Weighted Double Deep Multiagent Reinforcement Learning in Stochastic Cooperative Environments , 2018, PRICAI.
[29] David Silver,et al. Fictitious Self-Play in Extensive-Form Games , 2015, ICML.
[30] Toshiharu Sugawara,et al. Learning to Coordinate with Deep Reinforcement Learning in Doubles Pong Game , 2017, 2017 16th IEEE International Conference on Machine Learning and Applications (ICMLA).
[31] Sergey Levine,et al. Deep Reinforcement Learning in a Handful of Trials using Probabilistic Dynamics Models , 2018, NeurIPS.
[32] Demis Hassabis,et al. Mastering Atari, Go, chess and shogi by planning with a learned model , 2019, Nature.
[33] Yuk Ying Chung,et al. Learning Implicit Credit Assignment for Cooperative Multi-Agent Reinforcement Learning , 2020, NeurIPS.
[34] Jakub W. Pachocki,et al. Dota 2 with Large Scale Deep Reinforcement Learning , 2019, ArXiv.
[35] Karl Tuyls,et al. Evolutionary Dynamics of Multi-Agent Learning: A Survey , 2015, J. Artif. Intell. Res.
[36] Praveen Palanisamy,et al. Multi-Agent Connected Autonomous Driving using Deep Reinforcement Learning , 2019, 2020 International Joint Conference on Neural Networks (IJCNN).
[37] J. Dovidio,et al. Helping behavior and altruism: an empirical and conceptual overview , 1984 .
[38] Wojciech M. Czarnecki,et al. Grandmaster level in StarCraft II using multi-agent reinforcement learning , 2019, Nature.
[39] Matthew E. Taylor,et al. A Bayesian Approach for Learning and Tracking Switching, Non-Stationary Opponents: (Extended Abstract) , 2016, AAMAS.
[40] A. Colman. Cooperation, psychological game theory, and limitations of rationality in social interaction , 2003, Behavioral and Brain Sciences.
[41] Yaodong Yang,et al. An Overview of Multi-Agent Reinforcement Learning from Game Theoretical Perspective , 2020, ArXiv.
[42] O. H. Brownlee,et al. ACTIVITY ANALYSIS OF PRODUCTION AND ALLOCATION , 1952 .
[43] Wenlong Fu,et al. Model-based reinforcement learning: A survey , 2018 .
[44] Martijn C. Schut,et al. Evolving team behaviors with specialization , 2012, Genetic Programming and Evolvable Machines.
[45] Youngchul Sung,et al. Message-Dropout: An Efficient Training Method for Multi-Agent Deep Reinforcement Learning , 2019, AAAI.
[46] Fei Sha,et al. Actor-Attention-Critic for Multi-Agent Reinforcement Learning , 2018, ICML.
[47] Wenhang Bao,et al. Multi-Agent Deep Reinforcement Learning for Liquidation Strategy Analysis , 2019, ArXiv.
[48] G. Tesauro,et al. Learning Hierarchical Teaching Policies for Cooperative Agents , 2019, AAMAS.
[49] Ronald J. Williams,et al. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning , 1992, Machine Learning.
[50] Kevin Waugh,et al. DeepStack: Expert-level artificial intelligence in heads-up no-limit poker , 2017, Science.
[51] Kenneth O. Stanley,et al. Deep Neuroevolution: Genetic Algorithms Are a Competitive Alternative for Training Deep Neural Networks for Reinforcement Learning , 2017, ArXiv.
[52] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[53] Peter Stone,et al. Autonomous agents modelling other agents: A comprehensive survey and open problems , 2017, Artif. Intell.
[54] Athanasios S. Polydoros,et al. Survey of Model-Based Reinforcement Learning: Applications on Robotics , 2017, J. Intell. Robotic Syst.
[55] Shimon Whiteson,et al. MAVEN: Multi-Agent Variational Exploration , 2019, NeurIPS.
[56] Kenneth O. Stanley,et al. On the Relationship Between the OpenAI Evolution Strategy and Stochastic Gradient Descent , 2017, ArXiv.
[57] Peter L. Bartlett,et al. Variance Reduction Techniques for Gradient Estimates in Reinforcement Learning , 2001, J. Mach. Learn. Res.
[58] Hongyi Zhou,et al. MAPPER: Multi-Agent Path Planning with Evolutionary Reinforcement Learning in Mixed Dynamic Environments , 2020, 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).
[59] Noam Brown,et al. Superhuman AI for heads-up no-limit poker: Libratus beats top professionals , 2018, Science.
[60] Alexander Peysakhovich,et al. Multi-Agent Cooperation and the Emergence of (Natural) Language , 2016, ICLR.
[61] A. Goldman. To Appear in: , 2008 .
[62] Sam Devlin,et al. Potential-based difference rewards for multiagent reinforcement learning , 2014, AAMAS.
[63] Jonathan P. How,et al. Learning to Teach in Cooperative Multiagent Reinforcement Learning , 2018, AAAI.
[64] Yung Yi,et al. QTRAN: Learning to Factorize with Transformation for Cooperative Multi-Agent Reinforcement Learning , 2019, ICML.
[65] Pablo Hernandez-Leal,et al. A Survey of Learning in Multiagent Environments: Dealing with Non-Stationarity , 2017, ArXiv.
[66] Sam Devlin,et al. Theoretical considerations of potential-based reward shaping for multi-agent systems , 2011, AAMAS.
[67] Yujing Hu,et al. Q-value Path Decomposition for Deep Multiagent Reinforcement Learning , 2020, ICML.
[68] Srikanth Kandula,et al. Resource Management with Deep Reinforcement Learning , 2016, HotNets.
[69] Tom Eccles,et al. Learning Reciprocity in Complex Sequential Social Dilemmas , 2019, ArXiv.
[70] José M. F. Moura,et al. Natural Language Does Not Emerge ‘Naturally’ in Multi-Agent Dialog , 2017, EMNLP.
[71] Kian Hsiang Low,et al. R2-B2: Recursive Reasoning-Based Bayesian Optimization for No-Regret Learning in Games , 2020, ICML.
[72] David P. Landau,et al. Phase transitions and critical phenomena , 1989, Computing in Science & Engineering.
[73] Yuval Tassa,et al. Continuous control with deep reinforcement learning , 2015, ICLR.
[74] Filippos Christianos,et al. Dealing with Non-Stationarity in Multi-Agent Deep Reinforcement Learning , 2019, ArXiv.
[75] Diego Perez Liebana,et al. Teaching on a Budget in Multi-Agent Deep Reinforcement Learning , 2019, 2019 IEEE Conference on Games (CoG).
[76] Joelle Pineau,et al. On the Pitfalls of Measuring Emergent Communication , 2019, AAMAS.
[77] Jonathan P. How,et al. R-MADDPG for Partially Observable Environments and Limited Communication , 2019, ArXiv.
[78] Felipe Leno da Silva,et al. A Survey on Transfer Learning for Multiagent Reinforcement Learning Systems , 2019, J. Artif. Intell. Res.
[79] Sam Devlin,et al. Difference Rewards Policy Gradients , 2020, AAMAS.
[80] Ladislau Bölöni,et al. Multi-Agent Reinforcement Learning for Problems with Combined Individual and Team Reward , 2020, 2020 International Joint Conference on Neural Networks (IJCNN).
[81] David Hsu,et al. DESPOT: Online POMDP Planning with Regularization , 2013, NIPS.
[82] Stefan Lee,et al. Learning Cooperative Visual Dialog Agents with Deep Reinforcement Learning , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[83] J. K. Terry,et al. Revisiting Parameter Sharing in Multi-Agent Deep Reinforcement Learning , 2020 .
[84] Shimon Whiteson,et al. QMIX: Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning , 2018, ICML.
[85] Yi Wu,et al. Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments , 2017, NIPS.
[86] Alexander Nareyek,et al. Choosing search heuristics by non-stationary reinforcement learning , 2004 .
[87] Robert Babuska,et al. A Survey of Actor-Critic Reinforcement Learning: Standard and Natural Policy Gradients , 2012, IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews).
[88] Ofra Amir,et al. Interactive Teaching Strategies for Agent Training , 2016, IJCAI.
[89] Alexander Peysakhovich,et al. Prosocial Learning Agents Solve Generalized Stag Hunts Better than Selfish Ones: (Extended Abstract) , 2018 .
[90] Joel Z. Leibo,et al. Social Diversity and Social Preferences in Mixed-Motive Reinforcement Learning , 2020, AAMAS.
[91] Hans-Paul Schwefel,et al. An overview of evolutionary algorithms for parameter optimization , 1993 .
[92] Peng Peng,et al. Multiagent Bidirectionally-Coordinated Nets: Emergence of Human-level Coordination in Learning to Play StarCraft Combat Games , 2017, ArXiv:1703.10069.
[93] Julian Togelius,et al. AlphaStar: an evolutionary computation perspective , 2019, GECCO.
[94] Manuela M. Veloso,et al. Multiagent learning using a variable learning rate , 2002, Artif. Intell.
[95] Jun Wang,et al. Multi-Agent Reinforcement Learning , 2020, Deep Reinforcement Learning.
[96] Shane Legg,et al. Human-level control through deep reinforcement learning , 2015, Nature.
[97] Shimon Whiteson,et al. Counterfactual Multi-Agent Policy Gradients , 2017, AAAI.
[98] Bart De Schutter,et al. A Comprehensive Survey of Multiagent Reinforcement Learning , 2008, IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews).
[99] Ivan Titov,et al. Emergence of Language with Multi-agent Games: Learning to Communicate with Sequences of Symbols , 2017, NIPS.
[100] Bikramjit Banerjee,et al. Multi-agent reinforcement learning as a rehearsal for decentralized planning , 2016, Neurocomputing.
[101] Frans A. Oliehoek,et al. Scalable Planning and Learning for Multiagent POMDPs , 2014, AAAI.
[102] Nitish Srivastava,et al. Dropout: a simple way to prevent neural networks from overfitting , 2014, J. Mach. Learn. Res.
[103] Noam Brown,et al. Superhuman AI for multiplayer poker , 2019, Science.
[104] Angelo Cangelosi,et al. Hierarchical reinforcement learning as creative problem solving , 2016, Robotics Auton. Syst.
[105] Tianshu Chu,et al. Multi-Agent Deep Reinforcement Learning for Large-Scale Traffic Signal Control , 2019, IEEE Transactions on Intelligent Transportation Systems.
[106] John J. Grefenstette,et al. Evolutionary Algorithms for Reinforcement Learning , 1999, J. Artif. Intell. Res.
[107] Thomas Bäck,et al. An Overview of Evolutionary Algorithms for Parameter Optimization , 1993, Evolutionary Computation.
[108] Matthew Hausknecht,et al. Grounded Semantic Networks for Learning Shared Communication Protocols , 2016 .
[109] Jordan L. Boyd-Graber,et al. Opponent Modeling in Deep Reinforcement Learning , 2016, ICML.
[110] Manuela M. Veloso,et al. Rational and Convergent Learning in Stochastic Games , 2001, IJCAI.
[111] Kenneth O. Stanley,et al. Safe mutations for deep and recurrent neural networks through output gradients , 2017, GECCO.
[112] Joel Z. Leibo,et al. Inequity aversion improves cooperation in intertemporal social dilemmas , 2018, NeurIPS.
[113] Madalina M. Drugan,et al. Reinforcement learning versus evolutionary computation: A survey on hybrid algorithms , 2019, Swarm Evol. Comput.
[114] Shimon Whiteson,et al. Multi-Agent Common Knowledge Reinforcement Learning , 2018, NeurIPS.
[115] E. Fehr. A Theory of Fairness, Competition and Cooperation , 1998 .
[116] Felipe Leno da Silva,et al. Simultaneously Learning and Advising in Multiagent Reinforcement Learning , 2017, AAMAS.
[117] W. Hamilton,et al. The evolution of cooperation. , 1984, Science.
[118] Xiangyu Liu,et al. ACCNet: Actor-Coordinator-Critic Net for "Learning-to-Communicate" with Deep Multi-agent Reinforcement Learning , 2017, ArXiv.
[119] Demis Hassabis,et al. Mastering the game of Go with deep neural networks and tree search , 2016, Nature.
[120] Hoong Chuin Lau,et al. Credit Assignment For Collective Multiagent RL With Global Rewards , 2018, NeurIPS.
[121] Wojciech Jaskowski,et al. Heterogeneous team deep q-learning in low-dimensional multi-agent environments , 2016, 2016 IEEE Conference on Computational Intelligence and Games (CIG).
[122] Shimon Whiteson,et al. Learning to Communicate with Deep Multi-Agent Reinforcement Learning , 2016, NIPS.
[123] Kenneth O. Stanley,et al. Exploiting Open-Endedness to Solve Problems Through the Search for Novelty , 2008, ALIFE.
[124] Shauharda Khadka,et al. Evolutionary Reinforcement Learning for Sample-Efficient Multiagent Coordination , 2019, ICML.
[125] Yoav Shoham,et al. Multiagent Systems - Algorithmic, Game-Theoretic, and Logical Foundations , 2009 .
[126] Alexander Peysakhovich,et al. Maintaining cooperation in complex social dilemmas using deep reinforcement learning , 2017, ArXiv.
[127] Matthew E. Taylor,et al. A survey and critique of multiagent deep reinforcement learning , 2019, Autonomous Agents and Multi-Agent Systems.
[128] Joel Z. Leibo,et al. Multi-agent Reinforcement Learning in Sequential Social Dilemmas , 2017, AAMAS.
[129] Shimon Whiteson,et al. Weighted QMIX: Expanding Monotonic Value Function Factorisation , 2020, NeurIPS.
[130] Sam Devlin,et al. Dynamic potential-based reward shaping , 2012, AAMAS.
[131] Nando de Freitas,et al. Social Influence as Intrinsic Motivation for Multi-Agent Deep Reinforcement Learning , 2018, ICML.
[132] Anders Lyhne Christensen,et al. Avoiding convergence in cooperative coevolution with novelty search , 2014, AAMAS.
[133] Weinan Zhang,et al. Real-Time Bidding with Multi-Agent Reinforcement Learning in Display Advertising , 2018, CIKM.
[134] Peter Stone,et al. Transfer Learning for Reinforcement Learning Domains: A Survey , 2009, J. Mach. Learn. Res.
[135] Alex Graves,et al. Playing Atari with Deep Reinforcement Learning , 2013, ArXiv.
[136] Igor Mordatch,et al. Emergent Tool Use From Multi-Agent Autocurricula , 2019, ICLR.
[137] Hangyu Mao,et al. Learning multi-agent communication with double attentional deep reinforcement learning , 2020, Autonomous Agents and Multi-Agent Systems.
[138] Demis Hassabis,et al. Mastering the game of Go without human knowledge , 2017, Nature.
[139] G Gigerenzer,et al. Reasoning the fast and frugal way: models of bounded rationality. , 1996, Psychological review.
[140] Guigang Zhang,et al. Deep Learning , 2016, Int. J. Semantic Comput.
[141] Zongqing Lu,et al. Learning Attentional Communication for Multi-Agent Cooperation , 2018, NeurIPS.
[142] Kenneth O. Stanley,et al. Improving Exploration in Evolution Strategies for Deep Reinforcement Learning via a Population of Novelty-Seeking Agents , 2017, NeurIPS.
[143] R. Bellman. A Markovian Decision Process , 1957 .
[144] Marc Peter Deisenroth,et al. Deep Reinforcement Learning: A Brief Survey , 2017, IEEE Signal Processing Magazine.
[145] Guy Lever,et al. Value-Decomposition Networks For Cooperative Multi-Agent Learning Based On Team Reward , 2018, AAMAS.
[146] Mykel J. Kochenderfer,et al. Cooperative Multi-agent Control Using Deep Reinforcement Learning , 2017, AAMAS Workshops.
[147] Guy Lever,et al. Human-level performance in 3D multiplayer games with population-based reinforcement learning , 2018, Science.
[148] Shimon Whiteson,et al. Learning with Opponent-Learning Awareness , 2017, AAMAS.
[149] D. Premack,et al. Does the chimpanzee have a theory of mind? , 1978, Behavioral and Brain Sciences.
[150] Huaimin Wang,et al. Attention-Based Fault-Tolerant Approach for Multi-Agent Reinforcement Learning Systems , 2019, Entropy.
[151] Neil Burch,et al. Heads-up limit hold’em poker is solved , 2015, Science.
[152] Alex Graves,et al. Asynchronous Methods for Deep Reinforcement Learning , 2016, ICML.
[153] Lei Han,et al. LIIR: Learning Individual Intrinsic Reward in Multi-Agent Reinforcement Learning , 2019, NeurIPS.
[154] Klaus Diepold,et al. Multi-agent deep reinforcement learning: a survey , 2021, Artificial Intelligence Review.
[155] Gerd Gigerenzer,et al. Good Judgments Do Not Require Complex Cognition , 2008 .
[156] Yan Zheng,et al. A Deep Bayesian Policy Reuse Approach Against Non-Stationary Agents , 2018, NeurIPS.
[157] Jianye Hao,et al. Towards Cooperation in Sequential Prisoner's Dilemmas: a Deep Multiagent Reinforcement Learning Approach , 2018, ArXiv.
[158] Zhang-Wei Hong,et al. A Deep Policy Inference Q-Network for Multi-Agent Systems , 2017, AAMAS.