Multiagent Learning: Basics, Challenges, and Prospects

Multiagent systems (MAS) are widely accepted as an important method for solving problems of a distributed nature. A key to the success of MAS is efficient and effective multiagent learning (MAL). The past twenty-five years have seen great interest and tremendous progress in the field of MAL. This article introduces and surveys the field by presenting its fundamentals, sketching its historical development, and describing some key MAL algorithms. It also identifies the main challenges the field faces today.
