Multi-agent Reinforcement Learning: An Overview
[1] Olivier Buffet,et al. Shaping multi-agent systems with gradient reinforcement learning , 2007, Autonomous Agents and Multi-Agent Systems.
[2] Robert A. Lordo,et al. Learning from Data: Concepts, Theory, and Methods , 2001, Technometrics.
[3] William T. B. Uther,et al. Adversarial Reinforcement Learning , 2003 .
[4] Michail G. Lagoudakis,et al. Coordinated Reinforcement Learning , 2002, ICML.
[5] R. Paul Wiegand,et al. Biasing Coevolutionary Search for Optimal Multiagent Behaviors , 2006, IEEE Transactions on Evolutionary Computation.
[6] Yishay Mansour,et al. Nash Convergence of Gradient Dynamics in General-Sum Games , 2000, UAI.
[7] Klaus Debes,et al. A reinforcement learning based neural multiagent system for control of a combustion process , 2000, Proceedings of the IEEE-INNS-ENNS International Joint Conference on Neural Networks. IJCNN 2000. Neural Computing: New Challenges and Perspectives for the New Millennium.
[8] Hamid R. Berenji,et al. A convergent actor-critic-based FRL algorithm with application to power management of wireless transmitters , 2003, IEEE Trans. Fuzzy Syst..
[9] Manuela Veloso,et al. An Analysis of Stochastic Game Theory for Multiagent Reinforcement Learning , 2000 .
[10] Dimitri P. Bertsekas,et al. Dynamic Programming and Optimal Control 3rd Edition, Volume II , 2010 .
[11] Michael P. Wellman,et al. Nash Q-Learning for General-Sum Stochastic Games , 2003, J. Mach. Learn. Res..
[12] Claude F. Touzet,et al. Neural reinforcement learning for behaviour synthesis , 1997, Robotics Auton. Syst..
[13] Martin A. Riedmiller,et al. Karlsruhe Brainstormers - A Reinforcement Learning Approach to Robotic Soccer , 2000, RoboCup.
[14] Ralph Arnote,et al. Hong Kong (China) , 1996, OECD/G20 Base Erosion and Profit Shifting Project.
[15] Thomas G. Dietterich. What is machine learning? , 2020, Archives of Disease in Childhood.
[16] William S. Lovejoy,et al. Computationally Feasible Bounds for Partially Observed Markov Decision Processes , 1991, Oper. Res..
[17] Stefan Schaal,et al. Natural Actor-Critic , 2003, Neurocomputing.
[18] Ann Nowé,et al. Evolutionary game theory and multi-agent reinforcement learning , 2005, The Knowledge Engineering Review.
[19] Lionel Jouffe,et al. Fuzzy inference system learning by reinforcement methods , 1998, IEEE Trans. Syst. Man Cybern. Part C.
[20] Makoto Yokoo,et al. Intelligent Agents: Specification, Modeling, and Applications , 2001, Lecture Notes in Computer Science.
[21] T. Başar,et al. Dynamic Noncooperative Game Theory , 1982 .
[22] Bart De Schutter,et al. Multiagent Reinforcement Learning with Adaptive State Focus , 2005, BNAIC.
[23] Sandip Sen,et al. Strongly Typed Genetic Programming in Evolving Cooperation Strategies , 1995, ICGA.
[24] Ville Könönen,et al. Asymmetric multiagent reinforcement learning , 2003, Web Intell. Agent Syst..
[25] Richard S. Sutton,et al. Neuronlike adaptive elements that can solve difficult learning control problems , 1983, IEEE Transactions on Systems, Man, and Cybernetics.
[26] Michael I. Jordan,et al. Reinforcement Learning with Soft State Aggregation , 1994, NIPS.
[27] Sridhar Mahadevan,et al. Hierarchical multi-agent reinforcement learning , 2001, AGENTS '01.
[28] Vivek S. Borkar,et al. An actor-critic algorithm for constrained Markov decision processes , 2005, Syst. Control. Lett..
[29] Karl Tuyls,et al. An Evolutionary Dynamical Analysis of Multi-Agent Learning in Iterated Games , 2005, Autonomous Agents and Multi-Agent Systems.
[30] Keith B. Hall,et al. Correlated Q-Learning , 2003, ICML.
[31] Dit-Yan Yeung,et al. Predictive Q-Routing: A Memory-based Reinforcement Learning Approach to Adaptive Traffic Control , 1995, NIPS.
[32] Moshe Tennenholtz,et al. Adaptive Load Balancing: A Study in Multi-Agent Learning , 1994, J. Artif. Intell. Res..
[33] Shlomo Zilberstein,et al. Dynamic Programming for Partially Observable Stochastic Games , 2004, AAAI.
[34] Shin Ishii,et al. A Reinforcement Learning Scheme for a Partially-Observable Multi-Agent Game , 2005, Machine Learning.
[35] Milind Tambe,et al. Intelligent Agents VIII , 2002, Lecture Notes in Computer Science.
[36] J M Smith,et al. Evolution and the theory of games , 1976 .
[37] Yukinori Kakazu,et al. An approach to the pursuit problem on a heterogeneous multiagent system using reinforcement learning , 2003, Robotics Auton. Syst..
[38] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[39] Bart De Schutter,et al. Approximate Dynamic Programming and Reinforcement Learning , 2010, Interactive Collaborative Information Systems.
[40] Bikramjit Banerjee,et al. Adaptive policy gradient in multiagent learning , 2003, AAMAS '03.
[41] Michael L. Littman,et al. Value-function reinforcement learning in Markov games , 2001, Cognitive Systems Research.
[42] John N. Tsitsiklis,et al. Neuro-Dynamic Programming , 1996, Encyclopedia of Machine Learning.
[43] Liming Xiang,et al. Kernel-Based Reinforcement Learning , 2006, ICIC.
[44] Maja J. Mataric,et al. Reinforcement Learning in the Multi-Robot Domain , 1997, Auton. Robots.
[45] Andrew G. Barto,et al. Improving Elevator Performance Using Reinforcement Learning , 1995, NIPS.
[46] Marco Wiering,et al. Multi-Agent Reinforcement Learning for Traffic Light control , 2000 .
[47] Warren B. Powell,et al. Approximate Dynamic Programming: Solving the Curses of Dimensionality (Wiley Series in Probability and Statistics) , 2007 .
[48] Craig Boutilier,et al. Implicit Imitation in Multiagent Reinforcement Learning , 1999, ICML.
[49] Jordan B. Pollack,et al. A Game-Theoretic Approach to the Simple Coevolutionary Algorithm , 2000, PPSN.
[50] John N. Tsitsiklis,et al. Actor-Critic Algorithms , 1999, NIPS.
[51] R. Paul Wiegand,et al. Improving Coevolutionary Search for Optimal Multiagent Behaviors , 2003, IJCAI.
[52] Y. Narahari,et al. Reinforcement learning applications in dynamic pricing of retail markets , 2003, IEEE International Conference on E-Commerce, 2003. CEC 2003..
[53] Georgios Chalkiadakis. Multiagent reinforcement learning: stochastic games with multiple learning players , 2003 .
[54] David Carmel,et al. Opponent Modeling in Multi-Agent Systems , 1995, Adaption and Learning in Multi-Agent Systems.
[55] Martin A. Riedmiller. Neural Fitted Q Iteration - First Experiences with a Data Efficient Neural Reinforcement Learning Method , 2005, ECML.
[56] Daniel Kudenko,et al. Adaptive Agents and Multi-Agent Systems , 2003, Lecture Notes in Computer Science.
[57] Reda Alhajj,et al. Multiagent reinforcement learning using function approximation , 2000, IEEE Trans. Syst. Man Cybern. Part C.
[58] Warren B. Powell,et al. Approximate Dynamic Programming - Solving the Curses of Dimensionality , 2007 .
[59] Yoav Shoham,et al. If multi-agent learning is the answer, what is the question? , 2007, Artif. Intell..
[60] Pierre Yves Glorennec,et al. Reinforcement Learning: an Overview , 2000 .
[61] L. Buşoniu,et al. A comprehensive survey of multi-agent reinforcement learning , 2011 .
[62] Jürgen Schmidhuber. A General Method for Incremental Self-improvement and Multi-agent Learning in Unrestricted Environments , 1996 .
[63] Richard S. Sutton,et al. Learning to predict by the methods of temporal differences , 1988, Machine Learning.
[64] Richard S. Sutton,et al. Integrated Architectures for Learning, Planning, and Reacting Based on Approximating Dynamic Programming , 1990, ML.
[65] Q. Henry Wu,et al. Multi-agent learning for routing control within an Internet environment , 2004, Eng. Appl. Artif. Intell..
[66] Yoav Shoham,et al. New Criteria and a New Algorithm for Learning in Multi-Agent Systems , 2004, NIPS.
[67] Kenneth A. De Jong,et al. A Cooperative Coevolutionary Approach to Function Optimization , 1994, PPSN.
[68] Bernard Manderick,et al. Q-Learning in Simulated Robotic Soccer - Large State Spaces and Incomplete Information , 2002, ICMLA.
[69] Wenfei Fan,et al. Keys with Upward Wildcards for XML , 2001, DEXA.
[70] Mohamed S. Kamel,et al. Learning Coordination Strategies for Cooperative Multiagent Systems , 1998, Machine Learning.
[71] Sean Luke,et al. Cooperative Multi-Agent Learning: The State of the Art , 2005, Autonomous Agents and Multi-Agent Systems.
[72] 採編典藏組. Society for Industrial and Applied Mathematics(SIAM) , 2008 .
[73] Manuela Veloso,et al. Multiagent learning in the presence of agents with limitations , 2003 .
[74] Martin Lauer,et al. An Algorithm for Distributed Reinforcement Learning in Cooperative Multi-Agent Systems , 2000, ICML.
[75] Michael L. Littman,et al. Packet Routing in Dynamically Changing Networks: A Reinforcement Learning Approach , 1993, NIPS.
[76] Guillermo Ricardo Simari,et al. Multiagent systems: a modern approach to distributed artificial intelligence , 2000 .
[77] Reinhard Männer,et al. Parallel Problem Solving from Nature — PPSN III , 1994, Lecture Notes in Computer Science.
[78] Felix A. Fischer,et al. Hierarchical reinforcement learning in communication-mediated multiagent coordination , 2004, Proceedings of the Third International Joint Conference on Autonomous Agents and Multiagent Systems, 2004. AAMAS 2004..
[79] Dimitri P. Bertsekas,et al. Dynamic Programming and Optimal Control, Two Volume Set , 1995 .
[80] Martin Zinkevich,et al. Online Convex Programming and Generalized Infinitesimal Gradient Ascent , 2003, ICML.
[81] Bart De Schutter,et al. Multi-agent model predictive control for transportation networks: Serial versus parallel schemes , 2008, Eng. Appl. Artif. Intell..
[82] Claude F. Touzet,et al. Robot Awareness in Cooperative Mobile Robot Learning , 2000, Auton. Robots.
[83] Xiaofeng Wang,et al. Reinforcement Learning to Play an Optimal Nash Equilibrium in Team Markov Games , 2002, NIPS.
[84] Jürgen Schmidhuber,et al. Reinforcement Learning Soccer Teams with Incomplete World Models , 1999, Auton. Robots.
[85] Nikos A. Vlassis,et al. Sparse cooperative Q-learning , 2004, ICML.
[86] Michael H. Bowling,et al. Convergence and No-Regret in Multiagent Learning , 2004, NIPS.
[87] T. Jung,et al. Kernelizing LSPE(λ) , 2007, 2007 IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning.
[88] Vincent Conitzer,et al. AWESOME: A general multiagent learning algorithm that converges in self-play and learns a best response against stationary opponents , 2003, Machine Learning.
[89] Sandip Sen,et al. Learning to Coordinate without Sharing Information , 1994, AAAI.
[90] Nikos A. Vlassis,et al. Utile Coordination: Learning Interdependencies Among Cooperative Agents , 2005, CIG.
[91] Ville Könönen,et al. Gradient Based Method for Symmetric and Asymmetric Multiagent Reinforcement Learning , 2003, IDEAL.
[92] Martin L. Puterman,et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming , 1994 .
[93] E.H.J. Nijhuis,et al. Cooperative multi-agent reinforcement learning of traffic lights , 2005 .
[94] D. Fudenberg,et al. The Theory of Learning in Games , 1998 .
[95] T. Horiuchi,et al. Fuzzy interpolation-based Q-learning with continuous states and actions , 1996, Proceedings of IEEE 5th International Fuzzy Systems.
[96] Manuela M. Veloso,et al. Team-partitioned, opaque-transition reinforcement learning , 1999, AGENTS '99.
[97] Nikos Vlassis,et al. A Concise Introduction to Multiagent Systems and Distributed Artificial Intelligence , 2007, Synthesis Lectures on Artificial Intelligence and Machine Learning.
[98] Sandip Sen,et al. Learning in multiagent systems , 1999 .
[99] Pierre Geurts,et al. Tree-Based Batch Mode Reinforcement Learning , 2005, J. Mach. Learn. Res..
[100] William D. Smart,et al. Interpolation-based Q-learning , 2004, ICML.
[101] Xin Yao,et al. Parallel Problem Solving from Nature PPSN VI , 2000, Lecture Notes in Computer Science.
[102] Michail G. Lagoudakis,et al. Least-Squares Policy Iteration , 2003, J. Mach. Learn. Res..
[103] Bart De Schutter,et al. A Comprehensive Survey of Multiagent Reinforcement Learning , 2008, IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews).
[104] Michael P. Wellman,et al. The 2001 trading agent competition , 2002, Electron. Mark..
[105] Geoffrey E. Hinton,et al. Unsupervised learning : foundations of neural computation , 1999 .
[106] Michael H. Bowling,et al. Convergence Problems of General-Sum Multiagent Reinforcement Learning , 2000, ICML.
[107] Jing Peng,et al. Incremental multi-step Q-learning , 1994, Machine Learning.
[108] Michael I. Jordan,et al. MASSACHUSETTS INSTITUTE OF TECHNOLOGY ARTIFICIAL INTELLIGENCE LABORATORY and CENTER FOR BIOLOGICAL AND COMPUTATIONAL LEARNING DEPARTMENT OF BRAIN AND COGNITIVE SCIENCES , 1996 .
[109] Csaba Szepesvári,et al. Finite-Time Bounds for Fitted Value Iteration , 2008, J. Mach. Learn. Res..
[110] Craig Boutilier,et al. Planning, Learning and Coordination in Multiagent Decision Processes , 1996, TARK.
[111] Julie A. Adams,et al. Multiagent Systems: A Modern Approach to Distributed Artificial Intelligence , 2001, AI Mag..
[112] Peter Dayan,et al. Technical Note: Q-Learning , 2004, Machine Learning.
[113] Colin R. Reeves,et al. Evolutionary computation: a unified approach , 2007, Genetic Programming and Evolvable Machines.
[114] Daniel Kudenko,et al. Reinforcement learning of coordination in cooperative multi-agent systems , 2002, AAAI/IAAI.
[115] Milind Tambe,et al. The Communicative Multiagent Team Decision Problem: Analyzing Teamwork Theories and Models , 2011, J. Artif. Intell. Res..
[116] Victor R. Lesser,et al. Learning organizational roles for negotiated search in a multiagent system , 1998, Int. J. Hum. Comput. Stud..
[117] Maja J. Mataric,et al. Learning in Multi-Robot Systems , 1995, Adaption and Learning in Multi-Agent Systems.
[118] Yoav Shoham,et al. Multiagent Systems - Algorithmic, Game-Theoretic, and Logical Foundations , 2009 .
[119] Jeffrey O. Kephart,et al. Pricing in Agent Economies Using Multi-Agent Q-Learning , 2002, Autonomous Agents and Multi-Agent Systems.
[120] Manuela M. Veloso,et al. Multiagent Systems: A Survey from a Machine Learning Perspective , 2000, Auton. Robots.
[121] David G. Luenberger,et al. Linear and nonlinear programming , 1984 .
[122] Michael P. Wellman,et al. Multiagent Reinforcement Learning: Theoretical Framework and an Algorithm , 1998, ICML.
[123] Andrew W. Moore,et al. Reinforcement Learning for Cooperating and Communicating Reactive Agents in Electrical Power Grids , 2000, Balancing Reactivity and Social Deliberation in Multi-Agent Systems.
[124] Craig Boutilier,et al. The Dynamics of Reinforcement Learning in Cooperative Multiagent Systems , 1998, AAAI/IAAI.
[125] Peter Dayan,et al. Q-learning , 1992, Machine Learning.
[126] Gunes Ercal,et al. On No-Regret Learning, Fictitious Play, and Nash Equilibrium , 2001, ICML.
[127] Jae Won Lee,et al. A Multi-agent Q-learning Framework for Optimizing Stock Trading Systems , 2002, DEXA.
[128] Bart De Schutter,et al. Decentralized Reinforcement Learning Control of a Robotic Manipulator , 2006, 2006 9th International Conference on Control, Automation, Robotics and Vision.
[129] Maja J. Mataric,et al. Reward Functions for Accelerated Learning , 1994, ICML.
[130] Akira Hayashi,et al. A multiagent reinforcement learning algorithm using extended optimal response , 2002, AAMAS '02.
[131] H. Van Dyke Parunak,et al. Industrial and practical applications of DAI , 1999 .
[132] Manuela M. Veloso,et al. Multiagent learning using a variable learning rate , 2002, Artif. Intell..
[133] Manuela M. Veloso,et al. Rational and Convergent Learning in Stochastic Games , 2001, IJCAI.
[134] Andriy Zapechelnyuk. Limit Behavior of No-regret Dynamics , 2009 .
[135] C. Boutilier,et al. Accelerating Reinforcement Learning through Implicit Imitation , 2003, J. Artif. Intell. Res..
[136] Panos M. Pardalos,et al. Approximate dynamic programming: solving the curses of dimensionality , 2009, Optim. Methods Softw..
[137] Enrico Pagello,et al. Balancing Reactivity and Social Deliberation in Multi-Agent Systems , 2001, Lecture Notes in Computer Science.
[138] DeLiang Wang,et al. Unsupervised Learning: Foundations of Neural Computation , 2001, AI Mag..
[139] Thomas Bäck,et al. Evolutionary algorithms in theory and practice - evolution strategies, evolutionary programming, genetic algorithms , 1996 .
[140] T. Başar,et al. Dynamic Noncooperative Game Theory, 2nd Edition , 1998 .
[141] Shichao Zhang,et al. AI 2005: Advances in Artificial Intelligence, 18th Australian Joint Conference on Artificial Intelligence, Sydney, Australia, December 5-9, 2005, Proceedings , 2005, Australian Conference on Artificial Intelligence.
[142] Michael L. Littman,et al. Markov Games as a Framework for Multi-Agent Reinforcement Learning , 1994, ICML.
[143] Matthijs T. J. Spaan,et al. High level coordination of agents based on multiagent Markov decision processes with roles , 2002 .
[144] Von-Wun Soo,et al. Market Performance of Adaptive Trading Agents in Synchronous Double Auctions , 2001, PRIMA.
[145] Dimitri P. Bertsekas,et al. Least Squares Policy Evaluation Algorithms with Linear Function Approximation , 2003, Discret. Event Dyn. Syst..
[146] Sean P. Meyn,et al. An analysis of reinforcement learning with function approximation , 2008, ICML '08.
[147] José M. Vidal,et al. Learning in Multiagent Systems: An Introduction from a Game-Theoretic Perspective , 2003, Adaptive Agents and Multi-Agents Systems.
[148] Peter Stone,et al. Implicit Negotiation in Repeated Games , 2001, ATAL.
[149] John N. Tsitsiklis,et al. Asynchronous Stochastic Approximation and Q-Learning , 1994, Machine Learning.
[150] Toshiyuki Sueyoshi,et al. An agent-based decision support system for wholesale electricity market , 2008, Decis. Support Syst..
[151] Jeffrey S. Rosenschein,et al. Best-response multiagent learning in non-stationary environments , 2004, Proceedings of the Third International Joint Conference on Autonomous Agents and Multiagent Systems, 2004. AAMAS 2004..
[152] Nikos A. Vlassis,et al. Non-communicative multi-robot coordination in dynamic environments , 2005, Robotics Auton. Syst..
[153] Robert Fitch,et al. Structural Abstraction Experiments in Reinforcement Learning , 2005, Australian Conference on Artificial Intelligence.
[154] Andrew W. Moore,et al. Prioritized sweeping: Reinforcement learning with less data and less time , 2004, Machine Learning.
[155] Jürgen Schmidhuber,et al. Learning Team Strategies: Soccer Case Studies , 1998, Machine Learning.
[156] Andrew W. Moore,et al. Reinforcement Learning: A Survey , 1996, J. Artif. Intell. Res..
[157] Andrew G. Barto,et al. Elevator Group Control Using Multiple Reinforcement Learning Agents , 1998, Machine Learning.
[158] Gerald Tesauro,et al. Extending Q-Learning to General Adaptive Multi-Agent Systems , 2003, NIPS.
[159] Frans C. A. Groen,et al. Interactive Collaborative Information Systems , 2012, Interactive Collaborative Information Systems.
[160] Shin Ishii,et al. Multiagent reinforcement learning applied to a chase problem in a continuous world , 2001, Artificial Life and Robotics.