暂无分享,去创建一个
[1] Guy Lever,et al. Value-Decomposition Networks For Cooperative Multi-Agent Learning Based On Team Reward , 2018, AAMAS.
[2] Long Ji Lin,et al. Self-improving reactive agents based on reinforcement learning, planning and teaching , 1992, Machine Learning.
[3] Gerhard Lakemeyer,et al. Exploring artificial intelligence in the new millennium , 2003 .
[4] Tiejun Huang,et al. Graph Convolutional Reinforcement Learning , 2020, ICLR.
[5] Daniel Kudenko,et al. MAGNet: Multi-agent Graph Network for Deep Multi-agent Reinforcement Learning , 2019, 2019 XVI International Symposium "Problems of Redundancy in Information and Control Systems" (REDUNDANCY).
[6] Frans A. Oliehoek,et al. Coordinated Deep Reinforcement Learners for Traffic Light Control , 2016 .
[7] Nikos A. Vlassis,et al. Collaborative Multiagent Reinforcement Learning by Payoff Propagation , 2006, J. Mach. Learn. Res..
[8] David Silver,et al. Deep Reinforcement Learning with Double Q-Learning , 2015, AAAI.
[9] Ming Zhou,et al. Mean Field Multi-Agent Reinforcement Learning , 2018, ICML.
[10] Yung Yi,et al. QTRAN: Learning to Factorize with Transformation for Cooperative Multi-Agent Reinforcement Learning , 2019, ICML.
[11] Mariagrazia Dotoli,et al. Advanced control in factory automation: a survey , 2017, Int. J. Prod. Res..
[12] H. Francis Song,et al. Relational Forward Models for Multi-Agent Learning , 2018, ICLR.
[13] V. S. Glukhov,et al. Idiosyncrasies and challenges of data driven learning in electronic trading , 2018, 1811.09549.
[14] Peter Stone,et al. Deep Recurrent Q-Learning for Partially Observable MDPs , 2015, AAAI Fall Symposia.
[15] Drew Wicke,et al. Multiagent Soft Q-Learning , 2018, AAAI Spring Symposia.
[16] Shobha Venkataraman,et al. Context-specific multiagent coordination and planning with factored MDPs , 2002, AAAI/IAAI.
[17] William T. Freeman,et al. Understanding belief propagation and its generalizations , 2003 .
[18] Javier Alonso-Mora,et al. Multi-robot formation control and object transport in dynamic environments via constrained optimization , 2017, Int. J. Robotics Res..
[19] Xi Chen,et al. Learning From Demonstration in the Wild , 2018, 2019 International Conference on Robotics and Automation (ICRA).
[20] Michail G. Lagoudakis,et al. Coordinated Reinforcement Learning , 2002, ICML.
[21] R. Paul Wiegand,et al. Biasing Coevolutionary Search for Optimal Multiagent Behaviors , 2006, IEEE Transactions on Evolutionary Computation.
[22] Shane Legg,et al. Human-level control through deep reinforcement learning , 2015, Nature.
[23] Carlos M. Correa-Posada,et al. Integrated Power and Natural Gas Model for Energy Adequacy in Short-Term Operation , 2015, IEEE Transactions on Power Systems.
[24] Daphne Koller,et al. Computing Factored Value Functions for Policies in Structured MDPs , 1999, IJCAI.
[25] Yujing Hu,et al. Multi-Agent Game Abstraction via Graph Attention Neural Network , 2019, AAAI.
[26] Wojciech M. Czarnecki,et al. Grandmaster level in StarCraft II using multi-agent reinforcement learning , 2019, Nature.
[27] Yoshua Bengio,et al. Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling , 2014, ArXiv.
[28] Davood Hajinezhad,et al. A Review of Cooperative Multi-Agent Deep Reinforcement Learning , 2019, ArXiv.
[29] Shimon Whiteson,et al. The StarCraft Multi-Agent Challenge , 2019, AAMAS.
[30] Nicholas R. Jennings,et al. Bounded approximate decentralised coordination via the max-sum algorithm , 2009, Artif. Intell..
[31] Shimon Whiteson,et al. QMIX: Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning , 2018, ICML.
[32] Frans A. Oliehoek,et al. A Concise Introduction to Decentralized POMDPs , 2016, SpringerBriefs in Intelligent Systems.
[33] Shimon Whiteson,et al. The Representational Capacity of Action-Value Networks for Multi-Agent Reinforcement Learning , 2019, AAMAS.
[34] Roie Zivan,et al. Applying max-sum to teams of mobile sensing agents , 2018, Eng. Appl. Artif. Intell..
[35] Shimon Whiteson,et al. Exploration with Unreliable Intrinsic Reward in Multi-Agent Reinforcement Learning , 2019, ArXiv.
[36] Peter Dayan,et al. Q-learning , 1992, Machine Learning.
[37] Shimon Whiteson,et al. Learning to Communicate with Deep Multi-Agent Reinforcement Learning , 2016, NIPS.
[38] Avi Pfeffer,et al. Loopy Belief Propagation as a Basis for Communication in Sensor Networks , 2002, UAI.
[39] Michael I. Jordan,et al. Loopy Belief Propagation for Approximate Inference: An Empirical Study , 1999, UAI.
[40] Ying Wen,et al. Factorized Q-learning for large-scale multi-agent systems , 2018, DAI.
[41] Shimon Whiteson,et al. Stabilising Experience Replay for Deep Multi-Agent Reinforcement Learning , 2017, ICML.
[42] M. Stanković. Multi-agent reinforcement learning , 2016 .
[43] Shimon Whiteson,et al. Multi-Agent Common Knowledge Reinforcement Learning , 2018, NeurIPS.
[44] Bikramjit Banerjee,et al. Multi-agent reinforcement learning as a rehearsal for decentralized planning , 2016, Neurocomputing.
[45] Martin J. Wainwright,et al. Tree consistency and bounds on the performance of the max-product algorithm and its generalizations , 2004, Stat. Comput..
[46] Shimon Whiteson,et al. Counterfactual Multi-Agent Policy Gradients , 2017, AAAI.
[47] Yi Wu,et al. Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments , 2017, NIPS.
[48] Emilio Frazzoli,et al. On-demand high-capacity ride-sharing via dynamic trip-vehicle assignment , 2017, Proceedings of the National Academy of Sciences.
[49] Judea Pearl,et al. Probabilistic reasoning in intelligent systems - networks of plausible inference , 1991, Morgan Kaufmann series in representation and reasoning.
[50] Razvan Pascanu,et al. Relational inductive biases, deep learning, and graph networks , 2018, ArXiv.