LIGS: Learnable Intrinsic-Reward Generation Selection for Multi-Agent Learning
Nicolas Perez Nieves | Yaodong Yang | Jun Wang | Taher Jafferjee | D. Mguni | Oliver Slumbers | Jiangcheng Zhu | Jianhong Wang | Feifei Tong | Yang Li
[1] Yu Wang, et al. The Surprising Effectiveness of PPO in Cooperative Multi-Agent Games, 2021, NeurIPS.
[2] Tim C. Green, et al. Multi-Agent Reinforcement Learning for Active Voltage Control on Power Distribution Networks, 2021, NeurIPS.
[3] Yaodong Yang, et al. On the complexity of computing Markov perfect equilibrium in general-sum stochastic games, 2021, Electron. Colloquium Comput. Complex.
[4] Yaodong Yang, et al. Settling the Variance of Multi-Agent Policy Gradients, 2021, NeurIPS.
[5] Ying Wen, et al. Learning in Nonzero-Sum Stochastic Games with Potentials, 2021, ICML.
[6] Chongjie Zhang, et al. QPLEX: Duplex Dueling Multi-Agent Q-Learning, 2020, ICLR.
[7] Yunjie Gu, et al. Modelling Hierarchical Structure between Dialogue Policy and Natural Language Generator with Option Framework for Task-oriented Dialogue System, 2020, ICLR.
[8] Goran Strbac, et al. Multi-Agent Reinforcement Learning for Automated Peer-to-Peer Energy Trading in Double-Side Auction Market, 2021, IJCAI.
[9] Yaodong Yang, et al. An Overview of Multi-Agent Reinforcement Learning from Game Theoretical Perspective, 2020, ArXiv.
[10] Dong Chen, et al. SMARTS: Scalable Multi-Agent Reinforcement Learning Training School for Autonomous Driving, 2020, ArXiv.
[11] Yuk Ying Chung, et al. Learning Implicit Credit Assignment for Multi-Agent Actor-Critic, 2020, ArXiv.
[12] Lukas Schäfer, et al. Comparative Evaluation of Multi-Agent Deep Reinforcement Learning Algorithms, 2020, ArXiv.
[13] Yaodong Yang, et al. Multi-Agent Determinantal Q-Learning, 2020, ICML.
[14] Chongjie Zhang, et al. Towards Understanding Linear Value Decomposition in Cooperative Multi-Agent Q-Learning, 2020, ArXiv.
[15] Yunjie Gu, et al. Shapley Q-Value: A Local Reward Approach to Solve Global Reward Games, 2019, AAAI.
[16] Giovanni Montana, et al. Improving coordination in small-scale multi-agent deep reinforcement learning through memory-driven communication, 2019, Mach. Learn.
[17] Shimon Whiteson, et al. MAVEN: Multi-Agent Variational Exploration, 2019, NeurIPS.
[18] Yung Yi, et al. QTRAN: Learning to Factorize with Transformation for Cooperative Multi-Agent Reinforcement Learning, 2019, ICML.
[19] David Mguni, et al. Cutting Your Losses: Learning Fault-Tolerant Control and Optimal Stopping under Adverse Risk, 2019, ArXiv.
[20] Shimon Whiteson, et al. The StarCraft Multi-Agent Challenge, 2019, AAMAS.
[21] S. Shreve, et al. Stochastic differential equations, 1955, Mathematical Proceedings of the Cambridge Philosophical Society.
[22] Jun Wang, et al. Efficient Ridesharing Order Dispatching with Mean Field Multi-Agent Reinforcement Learning, 2019, WWW.
[23] Sergio Valcarcel Macua, et al. Coordinating the Crowd: Inducing Desirable Equilibria in Non-Cooperative Systems, 2019, AAMAS.
[24] Amos J. Storkey, et al. Exploration by Random Network Distillation, 2018, ICLR.
[25] Murray Shanahan, et al. Feature Control as Intrinsic Motivation for Hierarchical Reinforcement Learning, 2017, IEEE Transactions on Neural Networks and Learning Systems.
[26] Lei Han, et al. LIIR: Learning Individual Intrinsic Reward in Multi-Agent Reinforcement Learning, 2019, NeurIPS.
[27] Satinder Singh, et al. On Learning Intrinsic Rewards for Policy Gradient Methods, 2018, NeurIPS.
[28] Shimon Whiteson, et al. QMIX: Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning, 2018, ICML.
[29] D. Mguni, et al. A Viscosity Approach to Stochastic Differential Games of Control and Stopping Involving Impulsive Control, 2018, 1803.11432.
[30] Enrique Munoz de Cote, et al. Decentralised Learning in Systems with Many, Many Strategic Agents, 2018, AAAI.
[31] Santiago Zazo, et al. Learning Parametric Closed-Loop Policies for Markov Potential Games, 2018, ICLR.
[32] Lantao Yu, et al. A Study of AI Population Dynamics with Million-agent Reinforcement Learning, 2017, AAMAS.
[33] Guy Lever, et al. Value-Decomposition Networks For Cooperative Multi-Agent Learning Based On Team Reward, 2018, AAMAS.
[34] Shimon Whiteson, et al. Counterfactual Multi-Agent Policy Gradients, 2017, AAAI.
[35] Sam Devlin, et al. Reward shaping for knowledge-based multi-objective multi-agent reinforcement learning, 2018, The Knowledge Engineering Review.
[36] Sam Devlin, et al. Policy invariance under reward transformations for multi-objective reinforcement learning, 2017, Neurocomputing.
[37] Gerhard Neumann, et al. Guided Deep Reinforcement Learning for Swarm Systems, 2017, ArXiv.
[38] Alec Radford, et al. Proximal Policy Optimization Algorithms, 2017, ArXiv.
[39] Alexei A. Efros, et al. Curiosity-Driven Exploration by Self-Supervised Prediction, 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).
[40] Peng Peng, et al. Multiagent Bidirectionally-Coordinated Nets: Emergence of Human-level Coordination in Learning to Play StarCraft Combat Games, 2017, 1703.10069.
[41] Jun Wang, et al. Multiagent Bidirectionally-Coordinated Nets for Learning to Play StarCraft Combat Games, 2017, ArXiv.
[42] Marco Wiering, et al. Comparing exploration strategies for Q-learning in random stochastic mazes, 2016, 2016 IEEE Symposium Series on Computational Intelligence (SSCI).
[43] Traian Rebedea, et al. Playing Atari Games with Deep Reinforcement Learning and Human Checkpoint Replay, 2016, ArXiv.
[44] Christoph Manss, et al. Decentralized multi-agent exploration with online-learning of Gaussian processes, 2016, 2016 IEEE International Conference on Robotics and Automation (ICRA).
[45] Joshua B. Tenenbaum, et al. Hierarchical Deep Reinforcement Learning: Integrating Temporal Abstraction and Intrinsic Motivation, 2016, NIPS.
[46] Sam Devlin, et al. Plan-based reward shaping for multi-agent reinforcement learning, 2016, The Knowledge Engineering Review.
[47] Sam Devlin, et al. Expressing Arbitrary Reward Functions as Potential-Based Advice, 2015, AAAI.
[48] Maryam Sadeghlou, et al. Dynamic agent-based reward shaping for multi-agent systems, 2014, 2014 Iranian Conference on Intelligent Systems (ICIS).
[49] Sam Devlin, et al. Dynamic potential-based reward shaping, 2012, AAMAS.
[50] Guillaume J. Laurent, et al. Independent reinforcement learners in cooperative Markov games: a survey regarding coordination problems, 2012, The Knowledge Engineering Review.
[51] Sam Devlin, et al. An Empirical Study of Potential-Based Reward Shaping and Advice in Complex, Multi-Agent Systems, 2011, Adv. Complex Syst.
[52] Sam Devlin, et al. Theoretical considerations of potential-based reward shaping for multi-agent systems, 2011, AAMAS.
[53] Erhan Bayraktar, et al. On the One-Dimensional Optimal Switching Problem, 2007, Math. Oper. Res.
[54] Yoav Shoham, et al. Multiagent Systems - Algorithmic, Game-Theoretic, and Logical Foundations, 2009.
[55] Julia Donaldson, et al. The big match, 2008.
[56] T. Roughgarden, et al. Algorithmic Game Theory: Introduction to the Inefficiency of Equilibria, 2007.
[57] Michael L. Littman, et al. Cyclic Equilibria in Markov Games, 2005, NIPS.
[58] John N. Tsitsiklis, et al. Optimal stopping of Markov processes: Hilbert space theory, approximation algorithms, and an application to pricing high-dimensional financial derivatives, 1999, IEEE Trans. Autom. Control.
[59] Andrew Y. Ng, et al. Policy Invariance Under Reward Transformations: Theory and Application to Reward Shaping, 1999, ICML.
[60] Michael I. Jordan, et al. MASSACHUSETTS INSTITUTE OF TECHNOLOGY ARTIFICIAL INTELLIGENCE LABORATORY and CENTER FOR BIOLOGICAL AND COMPUTATIONAL LEARNING DEPARTMENT OF BRAIN AND COGNITIVE SCIENCES, 1996.
[61] Pierre Priouret, et al. Adaptive Algorithms and Stochastic Approximations, 1990, Applications of Mathematics.