Collaborative Multiagent Reinforcement Learning by Payoff Propagation
[1] L. Shapley,et al. Stochastic Games , 1953, Proceedings of the National Academy of Sciences.
[2] Umberto Bertelè,et al. Nonserial Dynamic Programming , 1972 .
[3] Judea Pearl,et al. Probabilistic reasoning in intelligent systems , 1988 .
[4] Judea Pearl,et al. Probabilistic reasoning in intelligent systems - networks of plausible inference , 1991, Morgan Kaufmann series in representation and reasoning.
[5] Makoto Yokoo,et al. Distributed Constraint Optimization as a Formal Model of Partially Adversarial Cooperation , 1991 .
[6] Michael L. Littman,et al. Packet Routing in Dynamically Changing Networks: A Reinforcement Learning Approach , 1993, NIPS.
[7] Martin L. Puterman,et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming , 1994 .
[8] Sandip Sen,et al. Learning to Coordinate without Sharing Information , 1994, AAAI.
[9] Gerald Tesauro,et al. Temporal Difference Learning and TD-Gammon , 1995, J. Int. Comput. Games Assoc..
[10] Gerald Tesauro,et al. Temporal difference learning and TD-Gammon , 1995, CACM.
[11] Andrew G. Barto,et al. Improving Elevator Performance Using Reinforcement Learning , 1995, NIPS.
[12] Nevin Lianwen Zhang,et al. Exploiting Causal Independence in Bayesian Network Inference , 1996, J. Artif. Intell. Res..
[13] John N. Tsitsiklis,et al. Neuro-Dynamic Programming , 1996, Encyclopedia of Machine Learning.
[14] Craig Boutilier,et al. Planning, Learning and Coordination in Multiagent Decision Processes , 1996, TARK.
[15] Hiroaki Kitano,et al. RoboCup: The Robot World Cup Initiative , 1997, AGENTS '97.
[16] Rina Dechter,et al. A Scheme for Approximating Probabilistic Inference , 1997, UAI.
[17] Craig Boutilier,et al. The Dynamics of Reinforcement Learning in Cooperative Multiagent Systems , 1998, AAAI/IAAI.
[18] Richard S. Sutton,et al. Introduction to Reinforcement Learning , 1998 .
[19] Michael I. Jordan,et al. Loopy Belief Propagation for Approximate Inference: An Empirical Study , 1999, UAI.
[20] Andrew W. Moore,et al. Distributed Value Functions , 1999, ICML.
[21] Gerhard Weiss,et al. Multiagent Systems , 1999 .
[22] Neil Immerman,et al. The Complexity of Decentralized Control of Markov Decision Processes , 2000, UAI.
[23] Kee-Eung Kim,et al. Learning to Cooperate via Policy Search , 2000, UAI.
[24] Wolfram Burgard,et al. Collaborative multi-robot exploration , 2000, Proceedings 2000 ICRA. Millennium Conference. IEEE International Conference on Robotics and Automation. Symposia Proceedings (Cat. No.00CH37065).
[25] Brendan J. Frey,et al. Factor graphs and the sum-product algorithm , 2001, IEEE Trans. Inf. Theory.
[26] Edmund H. Durfee,et al. Scaling Up Agent Coordination Strategies , 2001, Computer.
[27] Julie A. Adams,et al. Multiagent Systems: A Modern Approach to Distributed Artificial Intelligence , 2001, AI Mag..
[28] Carlos Guestrin,et al. Multiagent Planning with Factored MDPs , 2001, NIPS.
[29] Shobha Venkataraman,et al. Context-specific multiagent coordination and planning with factored MDPs , 2002, AAAI/IAAI.
[30] Lynne E. Parker,et al. Editorial: Advances in Multi-Robot Systems , 2002 .
[31] Michail G. Lagoudakis,et al. Coordinated Reinforcement Learning , 2002, ICML.
[32] Lynne E. Parker,et al. Distributed Algorithms for Multi-Robot Observation of Multiple Moving Targets , 2002, Auton. Robots.
[33] Milind Tambe,et al. The Communicative Multiagent Team Decision Problem: Analyzing Teamwork Theories and Models , 2002, J. Artif. Intell. Res..
[34] Lynne E. Parker,et al. Guest editorial advances in multirobot systems , 2002, IEEE Trans. Robotics Autom..
[35] William T. Freeman,et al. Understanding belief propagation and its generalizations , 2003 .
[36] Nikos Vlassis,et al. A Concise Introduction to Multiagent Systems and Distributed AI , 2003 .
[37] S. Shankar Sastry,et al. Autonomous Helicopter Flight via Reinforcement Learning , 2003, NIPS.
[38] Avi Pfeffer,et al. Loopy Belief Propagation as a Basis for Communication in Sensor Networks , 2002, UAI.
[39] Claudia V. Goldman,et al. Optimizing information exchange in cooperative multi-agent systems , 2003, AAMAS '03.
[40] Milind Tambe,et al. Distributed Sensor Networks: A Multiagent Perspective , 2003 .
[41] D. Koller,et al. Planning under uncertainty in complex structured environments , 2003 .
[42] Benjamin Van Roy,et al. Distributed Optimization in Adaptive Networks , 2003, NIPS.
[43] Craig Boutilier,et al. Coordination in multiagent reinforcement learning: a Bayesian approach , 2003, AAMAS '03.
[44] Claudia V. Goldman,et al. Transition-independent decentralized markov decision processes , 2003, AAMAS '03.
[45] Martin J. Wainwright,et al. Tree consistency and bounds on the performance of the max-product algorithm and its generalizations , 2004, Stat. Comput..
[46] Peter Dayan,et al. Q-learning , 1992, Machine Learning.
[47] Nikos A. Vlassis,et al. Sparse cooperative Q-learning , 2004, ICML.
[48] Ben Tse,et al. Autonomous Inverted Helicopter Flight via Reinforcement Learning , 2004, ISER.
[49] Peter Dayan,et al. Technical Note: Q-Learning , 2004, Machine Learning.
[50] Nikos A. Vlassis,et al. Anytime algorithms for multiagent decision making using coordination graphs , 2004, 2004 IEEE International Conference on Systems, Man and Cybernetics (IEEE Cat. No.04CH37583).
[51] Shlomo Zilberstein,et al. Dynamic Programming for Partially Observable Stochastic Games , 2004, AAAI.
[52] Claudia V. Goldman,et al. Decentralized Control of Cooperative Systems: Categorization and Complexity Analysis , 2004, J. Artif. Intell. Res..
[53] H.-A. Loeliger,et al. An introduction to factor graphs , 2004, IEEE Signal Process. Mag..
[54] Nicholas R. Jennings,et al. Cooperative Information Sharing to Improve Distributed Learning in Multi-Agent Systems , 2005, J. Artif. Intell. Res..
[55] Peter Stone,et al. Reinforcement Learning for RoboCup Soccer Keepaway , 2005, Adapt. Behav..
[56] Milind Tambe,et al. Preprocessing techniques for accelerating the DCOP algorithm ADOPT , 2005, AAMAS '05.
[57] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[58] Nikos A. Vlassis,et al. Non-communicative multi-robot coordination in dynamic environments , 2005, Robotics Auton. Syst..
[59] Nikos A. Vlassis,et al. Using the Max-Plus Algorithm for Multiagent Decision Making in Coordination Graphs , 2005, BNAIC.
[60] Makoto Yokoo,et al. Adopt: asynchronous distributed constraint optimization with quality guarantees , 2005, Artif. Intell..
[61] J. R. Kok,et al. Cooperation and learning in cooperative multiagent systems , 2006 .
[62] Agostino Poggi,et al. Multiagent Systems , 2006, Intelligenza Artificiale.