A Comprehensive Survey of Multiagent Reinforcement Learning
[1] J M Smith,et al. Evolution and the theory of games , 1976 .
[2] T. Başar,et al. Dynamic Noncooperative Game Theory , 1982 .
[3] Richard S. Sutton,et al. Neuronlike adaptive elements that can solve difficult learning control problems , 1983, IEEE Transactions on Systems, Man, and Cybernetics.
[4] Richard S. Sutton,et al. Integrated Architectures for Learning, Planning, and Reacting Based on Approximating Dynamic Programming , 1990, ML.
[5] William S. Lovejoy,et al. Computationally Feasible Bounds for Partially Observed Markov Decision Processes , 1991, Oper. Res..
[6] Michael L. Littman,et al. Packet Routing in Dynamically Changing Networks: A Reinforcement Learning Approach , 1993, NIPS.
[7] Michael I. Jordan,et al. MASSACHUSETTS INSTITUTE OF TECHNOLOGY ARTIFICIAL INTELLIGENCE LABORATORY and CENTER FOR BIOLOGICAL AND COMPUTATIONAL LEARNING DEPARTMENT OF BRAIN AND COGNITIVE SCIENCES , 1996 .
[8] Kenneth A. De Jong,et al. A Cooperative Coevolutionary Approach to Function Optimization , 1994, PPSN.
[9] Maja J. Mataric,et al. Reward Functions for Accelerated Learning , 1994, ICML.
[10] Michael L. Littman,et al. Markov Games as a Framework for Multi-Agent Reinforcement Learning , 1994, ICML.
[11] Martin L. Puterman,et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming , 1994 .
[12] Sandip Sen,et al. Learning to Coordinate without Sharing Information , 1994, AAAI.
[13] David Carmel,et al. Opponent Modeling in Multi-Agent Systems , 1995, Adaption and Learning in Multi-Agent Systems.
[14] Geoffrey J. Gordon. Stable Function Approximation in Dynamic Programming , 1995, ICML.
[15] Maja J. Mataric,et al. Learning in Multi-Robot Systems , 1995, Adaption and Learning in Multi-Agent Systems.
[16] Sandip Sen,et al. Strongly Typed Genetic Programming in Evolving Cooperation Strategies , 1995, ICGA.
[17] Andrew G. Barto,et al. Improving Elevator Performance Using Reinforcement Learning , 1995, NIPS.
[18] Dimitri P. Bertsekas,et al. Dynamic Programming and Optimal Control, Two Volume Set , 1995 .
[19] Dit-Yan Yeung,et al. Predictive Q-Routing: A Memory-based Reinforcement Learning Approach to Adaptive Traffic Control , 1995, NIPS.
[20] Moshe Tennenholtz,et al. Adaptive Load Balancing: A Study in Multi-Agent Learning , 1994, J. Artif. Intell. Res..
[21] Andrew W. Moore,et al. Reinforcement Learning: A Survey , 1996, J. Artif. Intell. Res..
[22] John N. Tsitsiklis,et al. Neuro-Dynamic Programming , 1996, Encyclopedia of Machine Learning.
[23] Thomas Bäck,et al. Evolutionary algorithms in theory and practice - evolution strategies, evolutionary programming, genetic algorithms , 1996 .
[24] John N. Tsitsiklis,et al. Analysis of temporal-difference learning with function approximation , 1996, NIPS 1996.
[25] Juergen Schmidhuber,et al. A General Method For Incremental Self-Improvement And Multi-Agent Learning In Unrestricted Environments , 1999 .
[26] Craig Boutilier,et al. Planning, Learning and Coordination in Multiagent Decision Processes , 1996, TARK.
[27] Maja J. Mataric,et al. Reinforcement Learning in the Multi-Robot Domain , 1997, Auton. Robots.
[28] Craig Boutilier,et al. The Dynamics of Reinforcement Learning in Cooperative Multiagent Systems , 1998, AAAI/IAAI.
[29] Michael P. Wellman,et al. Multiagent Reinforcement Learning: Theoretical Framework and an Algorithm , 1998, ICML.
[30] T. Başar,et al. Dynamic Noncooperative Game Theory, 2nd Edition , 1998 .
[31] Victor R. Lesser,et al. Learning organizational roles for negotiated search in a multiagent system , 1998, Int. J. Hum. Comput. Stud..
[32] D. Fudenberg,et al. The Theory of Learning in Games , 1998 .
[33] Sandip Sen,et al. Learning in multiagent systems , 1999 .
[34] H. Van Dyke Parunak,et al. Industrial and practical applications of DAI , 1999 .
[35] Manuela M. Veloso,et al. Team-partitioned, opaque-transition reinforcement learning , 1999, AGENTS '99.
[36] Craig Boutilier,et al. Implicit Imitation in Multiagent Reinforcement Learning , 1999, ICML.
[37] Jürgen Schmidhuber,et al. Reinforcement Learning Soccer Teams with Incomplete World Models , 1999, Auton. Robots.
[38] Geoffrey E. Hinton,et al. Unsupervised Learning: Foundations of Neural Computation , 1999 .
[39] Manuela M. Veloso,et al. Multiagent Systems: A Survey from a Machine Learning Perspective , 2000, Auton. Robots.
[40] Martin A. Riedmiller,et al. Reinforcement Learning for Cooperating and Communicating Reactive Agents in Electrical Power Grids , 2000, Balancing Reactivity and Social Deliberation in Multi-Agent Systems.
[41] Manuela Veloso,et al. An Analysis of Stochastic Game Theory for Multiagent Reinforcement Learning , 2000 .
[42] Gerhard Weiss. Industrial and Practical Applications of DAI , 2000 .
[43] Marco Wiering,et al. Multi-Agent Reinforcement Learning for Traffic Light control , 2000 .
[44] Claude F. Touzet,et al. Robot Awareness in Cooperative Mobile Robot Learning , 2000, Auton. Robots.
[45] Michael H. Bowling,et al. Convergence Problems of General-Sum Multiagent Reinforcement Learning , 2000, ICML.
[46] Martin Lauer,et al. An Algorithm for Distributed Reinforcement Learning in Cooperative Multi-Agent Systems , 2000, ICML.
[47] Yishay Mansour,et al. Nash Convergence of Gradient Dynamics in General-Sum Games , 2000, UAI.
[48] Martin A. Riedmiller,et al. Karlsruhe Brainstormers - A Reinforcement Learning Approach to Robotic Soccer , 2000, RoboCup.
[49] Jordan B. Pollack,et al. A Game-Theoretic Approach to the Simple Coevolutionary Algorithm , 2000, PPSN.
[50] Klaus Debes,et al. A reinforcement learning based neural multiagent system for control of a combustion process , 2000, Proceedings of the IEEE-INNS-ENNS International Joint Conference on Neural Networks. IJCNN 2000. Neural Computing: New Challenges and Perspectives for the New Millennium.
[51] Guillermo Ricardo Simari,et al. Multiagent systems: a modern approach to distributed artificial intelligence , 2000 .
[52] Reda Alhajj,et al. Multiagent reinforcement learning using function approximation , 2000, IEEE Trans. Syst. Man Cybern. Part C.
[53] Peter Stone,et al. Implicit Negotiation in Repeated Games , 2001, ATAL.
[54] Von-Wun Soo,et al. Market Performance of Adaptive Trading Agents in Synchronous Double Auctions , 2001, PRIMA.
[55] Manuela M. Veloso,et al. Rational and Convergent Learning in Stochastic Games , 2001, IJCAI.
[56] DeLiang Wang,et al. Unsupervised Learning: Foundations of Neural Computation , 2001, AI Mag..
[57] Michael L. Littman,et al. Value-function reinforcement learning in Markov games , 2001, Cognitive Systems Research.
[58] Manuela M. Veloso,et al. Multiagent learning using a variable learning rate , 2002, Artif. Intell..
[59] Daniel Kudenko,et al. Reinforcement learning of coordination in cooperative multi-agent systems , 2002, AAAI/IAAI.
[60] Xiaofeng Wang,et al. Reinforcement Learning to Play an Optimal Nash Equilibrium in Team Markov Games , 2002, NIPS.
[61] Bernard Manderick,et al. Q-Learning in Simulated Robotic Soccer - Large State Spaces and Incomplete Information , 2002, ICMLA.
[62] Byoung-Tak Zhang,et al. Stock Trading System Using Reinforcement Learning with Cooperative Agents , 2002, ICML.
[63] Jae Won Lee,et al. A Multi-agent Q-learning Framework for Optimizing Stock Trading Systems , 2002, DEXA.
[64] José M. Vidal,et al. Learning in Multiagent Systems: An Introduction from a Game-Theoretic Perspective , 2003, Adaptive Agents and Multi-Agents Systems.
[65] Matthijs T. J. Spaan,et al. High level coordination of agents based on multiagent Markov decision processes with roles , 2002 .
[66] Akira Hayashi,et al. A multiagent reinforcement learning algorithm using extended optimal response , 2002, AAMAS '02.
[67] Michael P. Wellman,et al. The 2001 trading agent competition , 2002, Electron. Mark..
[68] Michail G. Lagoudakis,et al. Coordinated Reinforcement Learning , 2002, ICML.
[69] Milind Tambe,et al. The Communicative Multiagent Team Decision Problem: Analyzing Teamwork Theories and Models , 2011, J. Artif. Intell. Res..
[70] Georgios Chalkiadakis. Multiagent reinforcement learning: stochastic games with multiple learning players , 2003 .
[71] Nikos Vlassis,et al. A Concise Introduction to Multiagent Systems and Distributed AI , 2003 .
[72] Yukinori Kakazu,et al. An approach to the pursuit problem on a heterogeneous multiagent system using reinforcement learning , 2003, Robotics Auton. Syst..
[73] C. Boutilier,et al. Accelerating Reinforcement Learning through Implicit Imitation , 2003, J. Artif. Intell. Res..
[74] Bikramjit Banerjee,et al. Adaptive policy gradient in multiagent learning , 2003, AAMAS '03.
[75] Gerald Tesauro,et al. Extending Q-Learning to General Adaptive Multi-Agent Systems , 2003, NIPS.
[76] Yoav Shoham,et al. Multi-Agent Reinforcement Learning: A Critical Survey , 2003 .
[77] Ville Könönen,et al. Gradient Based Method for Symmetric and Asymmetric Multiagent Reinforcement Learning , 2003, IDEAL.
[78] Manuela Veloso,et al. Multiagent learning in the presence of agents with limitations , 2003 .
[79] William T. B. Uther,et al. Adversarial Reinforcement Learning , 2003 .
[80] Michail G. Lagoudakis,et al. Least-Squares Policy Iteration , 2003, J. Mach. Learn. Res..
[81] R. Paul Wiegand,et al. Improving Coevolutionary Search for Optimal Multiagent Behaviors , 2003, IJCAI.
[82] Y. Narahari,et al. Reinforcement learning applications in dynamic pricing of retail markets , 2003, IEEE International Conference on E-Commerce, 2003. CEC 2003..
[83] Keith B. Hall,et al. Correlated Q-Learning , 2003, ICML.
[84] Michael P. Wellman,et al. Nash Q-Learning for General-Sum Stochastic Games , 2003, J. Mach. Learn. Res..
[85] Martin Zinkevich,et al. Online Convex Programming and Generalized Infinitesimal Gradient Ascent , 2003, ICML.
[86] Thomas Miconi. When Evolving Populations is Better than Coevolving Individuals: The Blind Mice Problem , 2003, IJCAI.
[87] Peter Dayan,et al. Q-learning , 1992, Machine Learning.
[88] Sridhar Mahadevan,et al. Hierarchical Multiagent Reinforcement Learning , 2004 .
[89] Daniel Kudenko,et al. Reinforcement learning of coordination in heterogeneous cooperative multi-agent systems , 2004, Proceedings of the Third International Joint Conference on Autonomous Agents and Multiagent Systems, 2004. AAMAS 2004..
[90] Q. Henry Wu,et al. Multi-agent learning for routing control within an Internet environment , 2004, Eng. Appl. Artif. Intell..
[91] Nikos A. Vlassis,et al. Sparse cooperative Q-learning , 2004, ICML.
[92] Yoav Shoham,et al. New Criteria and a New Algorithm for Learning in Multi-Agent Systems , 2004, NIPS.
[93] John N. Tsitsiklis,et al. Asynchronous Stochastic Approximation and Q-Learning , 1994, Machine Learning.
[94] Jürgen Schmidhuber,et al. Learning Team Strategies: Soccer Case Studies , 1998, Machine Learning.
[95] Jeffrey O. Kephart,et al. Pricing in Agent Economies Using Multi-Agent Q-Learning , 2002, Autonomous Agents and Multi-Agent Systems.
[96] Andrew W. Moore,et al. Prioritized sweeping: Reinforcement learning with less data and less time , 2004, Machine Learning.
[97] Peter Dayan,et al. Technical Note: Q-Learning , 2004, Machine Learning.
[98] Jeffrey S. Rosenschein,et al. Best-response multiagent learning in non-stationary environments , 2004, Proceedings of the Third International Joint Conference on Autonomous Agents and Multiagent Systems, 2004. AAMAS 2004..
[99] Michael H. Bowling,et al. Convergence and No-Regret in Multiagent Learning , 2004, NIPS.
[100] Ville Könönen,et al. Asymmetric multiagent reinforcement learning , 2003, Web Intell. Agent Syst..
[101] Andrew W. Moore,et al. Variable Resolution Discretization in Optimal Control , 2002, Machine Learning.
[102] Felix A. Fischer,et al. Hierarchical reinforcement learning in communication-mediated multiagent coordination , 2004, Proceedings of the Third International Joint Conference on Autonomous Agents and Multiagent Systems, 2004. AAMAS 2004..
[103] Shlomo Zilberstein,et al. Dynamic Programming for Partially Observable Stochastic Games , 2004, AAAI.
[104] John N. Tsitsiklis,et al. Feature-based methods for large scale dynamic programming , 2004, Machine Learning.
[105] Andrew G. Barto,et al. Elevator Group Control Using Multiple Reinforcement Learning Agents , 1998, Machine Learning.
[106] William D. Smart,et al. Interpolation-based Q-learning , 2004, ICML.
[107] Jing Peng,et al. Incremental multi-step Q-learning , 1994, Machine Learning.
[108] Mohamed S. Kamel,et al. Learning Coordination Strategies for Cooperative Multiagent Systems , 1998, Machine Learning.
[109] Sean Luke,et al. Cooperative Multi-Agent Learning: The State of the Art , 2005, Autonomous Agents and Multi-Agent Systems.
[110] Csaba Szepesvári,et al. Finite time bounds for sampling based fitted value iteration , 2005, ICML.
[111] Pierre Geurts,et al. Tree-Based Batch Mode Reinforcement Learning , 2005, J. Mach. Learn. Res..
[112] Nikos A. Vlassis,et al. Utile Coordination: Learning Interdependencies Among Cooperative Agents , 2005, CIG.
[113] Robert Fitch,et al. Structural Abstraction Experiments in Reinforcement Learning , 2005, Australian Conference on Artificial Intelligence.
[114] Karl Tuyls,et al. An Evolutionary Dynamical Analysis of Multi-Agent Learning in Iterated Games , 2005, Autonomous Agents and Multi-Agent Systems.
[115] E.H.J. Nijhuis,et al. Cooperative multi-agent reinforcement learning of traffic lights , 2005 .
[116] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[117] Nikos A. Vlassis,et al. Non-communicative multi-robot coordination in dynamic environments , 2005, Robotics Auton. Syst..
[118] Richard S. Sutton,et al. Learning to predict by the methods of temporal differences , 1988, Machine Learning.
[119] Shin Ishii,et al. A Reinforcement Learning Scheme for a Partially-Observable Multi-Agent Game , 2005, Machine Learning.
[120] Ann Nowé,et al. Evolutionary game theory and multi-agent reinforcement learning , 2005, The Knowledge Engineering Review.
[121] Nikos A. Vlassis,et al. Using the Max-Plus Algorithm for Multiagent Decision Making in Coordination Graphs , 2005, BNAIC.
[122] Bart De Schutter,et al. Multiagent Reinforcement Learning with Adaptive State Focus , 2005, BNAIC.
[123] R. Paul Wiegand,et al. Biasing Coevolutionary Search for Optimal Multiagent Behaviors , 2006, IEEE Transactions on Evolutionary Computation.
[124] Liming Xiang,et al. Kernel-Based Reinforcement Learning , 2006, ICIC.
[125] Bart De Schutter,et al. Decentralized Reinforcement Learning Control of a Robotic Manipulator , 2006, 2006 9th International Conference on Control, Automation, Robotics and Vision.
[126] Vincent Conitzer,et al. AWESOME: A general multiagent learning algorithm that converges in self-play and learns a best response against stationary opponents , 2003, Machine Learning.
[127] Olivier Buffet,et al. Shaping multi-agent systems with gradient reinforcement learning , 2007, Autonomous Agents and Multi-Agent Systems.
[128] Shin Ishii,et al. Multiagent reinforcement learning applied to a chase problem in a continuous world , 2001, Artificial Life and Robotics.
[129] Sridhar Mahadevan,et al. Hierarchical multi-agent reinforcement learning , 2001, AGENTS '01.
[130] Colin R. Reeves,et al. Evolutionary computation: a unified approach , 2007, Genetic Programming and Evolvable Machines.
[131] Rémi Munos,et al. Performance Bounds in Lp-norm for Approximate Value Iteration , 2007, SIAM J. Control. Optim..
[132] De,et al. Relational Reinforcement Learning , 2001, Encyclopedia of Machine Learning and Data Mining.