Cooperative Multi-Agent Learning: The State of the Art

Cooperative multi-agent systems (MAS) are systems in which several agents attempt, through their interaction, to jointly solve tasks or to maximize utility. Because the agents interact, the complexity of a multi-agent problem can rise rapidly with the number of agents or with their behavioral sophistication. The difficulty this poses for programming solutions to MAS problems by hand has spawned increasing interest in machine learning techniques that automate the search and optimization process. We provide a broad survey of the cooperative multi-agent learning literature. Previous surveys of this area have largely focused on issues common to specific subareas, for example reinforcement learning (RL) or robotics. In this survey we attempt to draw from multi-agent learning work in a spectrum of areas, including RL, evolutionary computation, game theory, complex systems, agent modeling, and robotics. We find that this broad view leads to a division of the work into two categories, each with its own special issues: applying a single learner to discover joint solutions to multi-agent problems (team learning), and using multiple simultaneous learners, often one per agent (concurrent learning). Additionally, we discuss direct and indirect communication in connection with learning, as well as open issues in task decomposition, scalability, and adaptive dynamics. We conclude with a presentation of multi-agent learning problem domains and a list of multi-agent learning resources.

[1]  R. Matthews,et al.  Ants. , 1898, Science.

[2]  Keith B. Hall,et al.  Fair and Efficient Solutions to the Santa Fe Bar Problem , 1910 .

[3]  Karl Ernst Osthaus Van de Velde , 1920 .

[4]  Arthur L. Samuel,et al.  Some Studies in Machine Learning Using the Game of Checkers , 1967, IBM J. Res. Dev..

[5]  K. Dejong,et al.  An analysis of the behavior of a class of genetic adaptive systems , 1975 .

[6]  John H. Holland,et al.  Adaptation in Natural and Artificial Systems: An Introductory Analysis with Applications to Biology, Control, and Artificial Intelligence , 1992 .

[7]  Kenneth Alan De Jong,et al.  An analysis of the behavior of a class of genetic adaptive systems. , 1975 .

[8]  G. Holton Sociobiology: the new synthesis? , 1977, Newsletter on science, technology & human values.

[9]  W. Hamilton,et al.  The evolution of cooperation. , 1984, Science.

[10]  Frederick Hayes-Roth,et al.  Distributed Intelligence for Air Fleet Control , 1981 .

[11]  M. Benda,et al.  On Optimal Cooperation of Knowledge Sources , 1985 .

[12]  John H. Holland,et al.  Properties of the Bucket Brigade , 1985, ICGA.

[13]  Michael N. Huhns,et al.  An intelligent system for document retrieval in distributed office environments , 1986, J. Am. Soc. Inf. Sci..

[14]  Michael N. Huhns,et al.  An intelligent system for document retrieval in distributed office environments , 1986 .

[15]  Craig W. Reynolds Flocks, herds, and schools: a distributed behavioral model , 1987, SIGGRAPH.

[16]  Edmund H. Durfee,et al.  Coherent Cooperation Among Communicating Problem Solvers , 1987, IEEE Transactions on Computers.

[17]  Edmund H. Durfee,et al.  An Update on the Distributed Vehicle Monitoring Testbed , 1987 .

[18]  John N. Tsitsiklis,et al.  The Complexity of Markov Decision Processes , 1987, Math. Oper. Res..

[19]  David E. Goldberg,et al.  Genetic Algorithms in Search Optimization and Machine Learning , 1988 .

[20]  Robert B. Wesson,et al.  Architectures for distributed air-traffic control , 1988 .

[21]  Edmund H. Durfee,et al.  Trends in Cooperative Distributed Problem Solving , 1989, IEEE Trans. Knowl. Data Eng..

[22]  A. Barto,et al.  Learning and Sequential Decision Making , 1989 .

[23]  Edmund H. Durfee,et al.  Evaluating Research in Cooperative Distributed Problem Solving , 1990, Distributed Artificial Intelligence.

[24]  D. E. Goldberg,et al.  Genetic Algorithms in Search , 1989 .

[25]  W. Daniel Hillis,et al.  Co-evolving parasites improve simulated evolution as an optimization procedure , 1990 .

[26]  David R. Jefferson,et al.  An Artificial Neural Network Representation for Artificial Organisms , 1990, PPSN.

[27]  Luís Torgo,et al.  Panel: Learning in Distributed Systems and Multi-Agent Environments , 1991, EWSL.

[28]  Jean-Louis Deneubourg,et al.  The dynamics of collective sorting robot-like ants and ant-like robots , 1991 .

[29]  Charles E. Taylor,et al.  Artificial Life II , 1991 .

[30]  John J. Grefenstette,et al.  Lamarckian Learning in Multi-Agent Environments , 1991, ICGA.

[31]  P. Gmytrasiewicz A decision-theoretic model of coordination and communication in autonomous systems , 1992 .

[32]  Edmund H. Durfee,et al.  What Your Computer Really Needs to Know, You Learned in Kindergarten , 1992, AAAI.

[33]  Sridhar Mahadevan,et al.  Automatic Programming of Behavior-Based Robots Using Reinforcement Learning , 1991, Artif. Intell..

[34]  K. Fischer,et al.  Sophisticated and distributed: The transportation domain , 1993, Proceedings of 9th IEEE Conference on Artificial Intelligence for Applications.

[35]  Michael L. Littman,et al.  A Distributed Reinforcement Learning Scheme for Network Routing , 1993 .

[36]  Peter J. Angeline,et al.  Competitive Environments Evolve Better Solutions for Complex Tasks , 1993, ICGA.

[37]  Mark R. Cutkosky,et al.  PACT: an experiment in integrating concurrent engineering systems , 1993, Computer.

[38]  John R. Koza,et al.  Genetic programming - on the programming of computers by means of natural selection , 1993, Complex adaptive systems.

[39]  Ming Tan,et al.  Multi-Agent Reinforcement Learning: Independent versus Cooperative Agents , 1997, ICML.

[40]  Holly A. Yanco,et al.  An adaptive communication protocol for cooperating mobile robots , 1993 .

[41]  Craig W. Reynolds An evolved, vision-based behavioral model of coordinated group motion , 1993 .

[42]  Nicholas R. Jennings,et al.  Transforming standalone expert systems into a community of cooperating agents , 1993 .

[43]  Michael R. M. Jenkin,et al.  A taxonomy for swarm robots , 1993, Proceedings of 1993 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS '93).

[44]  Ursula M. Schwuttke,et al.  Enhancing Performance of Cooperating Agents in Real-Time Diagnostic Systems , 1993, IJCAI.

[45]  R. Bajcsy IJCAI-93 : proceedings of the Thirteenth International Joint Conference on Artificial Intelligence , Chambéry, France, August 28-September 3, 1993 , 1993 .

[46]  Michael L. Littman,et al.  Packet Routing in Dynamically Changing Networks: A Reinforcement Learning Approach , 1993, NIPS.

[47]  Michael G. Dyer,et al.  Evolution of herding behavior in artificial animals , 1993 .

[48]  James P. Crutchfield,et al.  A Genetic Algorithm Discovers Particle-Based Computation in Cellular Automata , 1994, PPSN.

[49]  Nicholas R. Jennings,et al.  Integrating Intelligent Systems into a Cooperating Community for Electricity Distribution Management , 1994 .

[50]  Larry Bull,et al.  Evolving cooperative communicating classifier systems , 1994 .

[51]  Kenneth A. De Jong,et al.  A Cooperative Coevolutionary Approach to Function Optimization , 1994, PPSN.

[52]  Maja J. Matarić,et al.  Leaning to behave socially , 1994 .

[53]  Sebastian Thrun,et al.  Learning to Play the Game of Chess , 1994, NIPS.

[54]  Maja J. Mataric,et al.  Reward Functions for Accelerated Learning , 1994, ICML.

[55]  David H. Ackley,et al.  Altruism in the evolution of communication , 1994 .

[56]  Michael L. Littman,et al.  Markov Games as a Framework for Multi-Agent Reinforcement Learning , 1994, ICML.

[57]  Maja J. Mataric,et al.  Interaction and intelligent behavior , 1994 .

[58]  Jörg P. Müller,et al.  An Architecture for Dynamically Interacting Agents , 1994, Int. J. Cooperative Inf. Syst..

[59]  W. Arthur Inductive Reasoning and Bounded Rationality , 1994 .

[60]  Sandip Sen,et al.  Learning to Coordinate without Sharing Information , 1994, AAAI.

[61]  B. Huberman,et al.  THE DYNAMICS OF SOCIAL DILEMMAS , 1994 .

[62]  M. Matarić Learning to Behave Socially , 1994 .

[63]  David Carmel,et al.  The M* Algorithm: Incorporating Opponent Models into Adversary Search , 1994 .

[64]  Gerhard Weiss,et al.  Some Studies in Distributed Machine Learning and Organizational Design , 1994 .

[65]  Craig W. Reynolds Competition, Coevolution and the Game of Tag , 1994 .

[66]  John Fox,et al.  An Agent Architecture for Distributed Medical Care , 1995, ECAI Workshop on Agent Theories, Architectures, and Languages.

[67]  Lawrence J. Fogel,et al.  Evolutionary Programming: Proceedings of the Third Annual Conference , 1994 .

[68]  Hugo Velthuijsen,et al.  Application of Distributed AI and Cooperative Problem Solving to Telecommunications , 1994 .

[69]  John J. Grefenstette,et al.  A Coevolutionary Approach to Learning Sequential Decision Rules , 1995, ICGA.

[70]  Luc Steels,et al.  A Self-Organizing Spatial Vocabulary , 1995, Artificial Life.

[71]  Claudia V. Goldman,et al.  Mutually Supervised Learning in Multiagent Systems , 1995, Adaption and Learning in Multi-Agent Systems.

[72]  Simon R. Schultz,et al.  A Reinforcement Learning Exploration Strategy based on Ant Foraging Mechanisms , 1995 .

[73]  Gerald Tesauro,et al.  Temporal difference learning and TD-Gammon , 1995, CACM.

[74]  Gerhard Weiß Distributed Machine Learning , 1995, DISKI.

[75]  Mahendra Sekaran,et al.  To help or not to help , 1995 .

[76]  Dave Cliff,et al.  Tracking the Red Queen: Measurements of Adaptive Progress in Co-Evolutionary Simulations , 1995, ECAL.

[77]  Sandip Sen,et al.  Strongly Typed Genetic Programming in Evolving Cooperation Strategies , 1995, ICGA.

[78]  Milind Tambe Recursive Agent and Agent-Group Tracking in a Real-Time Dynamic Environment , 1995, ICMAS.

[79]  Luc Steels The Spontaneous Self-organization of an Adaptive Language , 1995, Machine Intelligence 15.

[80]  Sandip Sen,et al.  Multiagent Coordination with Learning Classifier Systems , 1995, Adaption and Learning in Multi-Agent Systems.

[81]  Sandip Sen,et al.  Evolving a Team , 1995 .

[82]  Nicholas R. Jennings,et al.  Intelligent agents: theory and practice , 1995, The Knowledge Engineering Review.

[83]  Sandip Sen,et al.  Using Reciprocity to Adapt to Others , 1995, Adaption and Learning in Multi-Agent Systems.

[84]  Eizo Akiyama,et al.  Evolution of Cooperation, Differentiation, Complexity, and Diversity in an Iterated Three-Person Game , 1995, Artificial Life.

[85]  Sandip Sen,et al.  Evolving Beharioral Strategies in Predators and Prey , 1995, Adaption and Learning in Multi-Agent Systems.

[86]  Martin Nilsson,et al.  Cooperative multi-robot box-pushing , 1995, Proceedings 1995 IEEE/RSJ International Conference on Intelligent Robots and Systems. Human Robot Interaction and Cooperative Robots.

[87]  Sandip Sen,et al.  Evolving Multiagent Coordination Strategies with Genetic Programming , 1995 .

[88]  Tuomas Sandholm,et al.  On Multiagent Q-Learning in a Semi-Competitive Domain , 1995, Adaption and Learning in Multi-Agent Systems.

[89]  Andrew W. Moore,et al.  Reinforcement Learning: A Survey , 1996, J. Artif. Intell. Res..

[90]  Pattie Maes,et al.  Incremental Self-Improvement for Life-Time Multi-Agent Reinforcement Learning , 1996 .

[91]  F. H. Bennett,et al.  Discovery by genetic programming of a cellular automata rule that is better than any known rule for the majority classification problem , 1996 .

[92]  Jürgen Schmidhuber,et al.  Multi-Agent Learning with the Success-Story Algorithm , 1996, ECAI Workshop LDAIS / ICMAS Workshop LIOME.

[93]  Sandip Sen,et al.  Cooperation of the Fittest , 1996 .

[94]  Stewart W. Wilson,et al.  Not) Evolving Collective Behaviours in Synthetic Fish , 1996 .

[95]  Yuichiro Anzai,et al.  Addressee Learning and Message Interception for Communication Load Reduction in Multiple Robot Environments , 1996, ECAI Workshop LDAIS / ICMAS Workshop LIOME.

[96]  M. Lichbach The cooperator's dilemma , 1996 .

[97]  Zbigniew Michalewicz,et al.  Genetic Algorithms + Data Structures = Evolution Programs , 1996, Springer Berlin Heidelberg.

[98]  Hewlett-Packard Laboratories Bristol,et al.  (Not) Evolving Collective Behaviours in Synthetic Fish , 1996 .

[99]  Lee Spector,et al.  Evolving teamwork and coordination with genetic programming , 1996 .

[100]  Sandip Sen IJCAI-95 Workshop on Adaptation and Learning in Multiagent Systems , 1996 .

[101]  Thomas Bäck,et al.  Evolutionary Algorithms in Theory and Practice , 1996 .

[102]  L. Steels Self-organising vocabularies , 1996 .

[103]  Pattie Maes,et al.  The Evolution of Communication Schemes Over Continuous Channels , 1996 .

[104]  Andrew G. Barto,et al.  Large-scale dynamic optimization using teams of reinforcement learning agents , 1996 .

[105]  Jordan B. Pollack,et al.  Coevolution of a Backgammon Player , 1996 .

[106]  Pattie Maes,et al.  Emergent Adaptive Lexicons , 1996 .

[107]  Thomas Bäck,et al.  Evolutionary algorithms in theory and practice - evolution strategies, evolutionary programming, genetic algorithms , 1996 .

[108]  Michael Wooldridge,et al.  Production Sequencing as Negotiation , 1996, PAAM.

[109]  Sandip Sen,et al.  Learning Cases to Compliment Rules for Conflict Resolution in Multiagent Systems , 1996 .

[110]  Corso Elvezia Realistic Multi-agent Reinforcement Learning , 1996 .

[111]  Juergen Schmidhuber,et al.  Incremental self-improvement for life-time multi-agent reinforcement learning , 1996 .

[112]  H. Van Dyke Parunak,et al.  Applications of distributed artificial intelligence in industry , 1996 .

[113]  Junling Hu,et al.  Self-fulfilling Bias in Multiagent Learning , 1996 .

[114]  Nicholas R. Jennings,et al.  Foundations of Distributed AI , 1996 .

[115]  Zbigniew Michalewicz,et al.  Genetic algorithms + data structures = evolution programs (3rd ed.) , 1996 .

[116]  Hitoshi Iba Emergent Cooperation for Multiple Agents Using Genetic Programming , 1996, PPSN.

[117]  Craig Boutilier,et al.  Planning, Learning and Coordination in Multiagent Decision Processes , 1996, TARK.

[118]  Craig Boutilier,et al.  Learning Conventions in Multiagent Stochastic Domains using Likelihood Estimates , 1996, UAI.

[119]  B. Russel,et al.  Generating Vowel Systems in a Population of Agents , 1997 .

[120]  T. Cormen,et al.  Model-based Learning of Interaction Strategies in Multi-agent Systems , 1997 .

[121]  Sandip Sen,et al.  Evolving Cooperative Groups: Preliminary Results , 1997 .

[122]  James A. Hendler,et al.  Co-evolving Soccer Softbot Team Coordination with Genetic Programming , 1997, RoboCup.

[123]  Sandip Sen,et al.  Crossover Operators for Evolving A Team , 1997 .

[124]  Marco Wiering,et al.  Learning Team Strategies With Multiple Policy-Sharing Agents: A Soccer Case Study , 1997 .

[125]  Devika Subramanian,et al.  Ants and Reinforcement Learning: A Case Study in Routing in Dynamic Networks , 1997, IJCAI.

[126]  Andrew B. Kahng,et al.  Cooperative Mobile Robotics: Antecedents and Directions , 1997, Auton. Robots.

[127]  Maram V. Nagendraprasad,et al.  Learning situtation-specific control in multi-agent systems , 1997 .

[128]  Hiroaki Kitano,et al.  RoboCup: The Robot World Cup Initiative , 1997, AGENTS '97.

[129]  S. Sen Multiagent systems: Milestones and new horizons , 1997, Trends in Cognitive Sciences.

[130]  Mitchell A. Potter,et al.  The design and analysis of a computational model of cooperative coevolution , 1997 .

[131]  Inman Harvey,et al.  Evolutionary robotics: the Sussex approach , 1997, Robotics Auton. Syst..

[132]  Tucker Balch,et al.  Learning Roles: Behavioral Diversity in Robot Teams , 1997 .

[133]  Dan L. Grecu,et al.  Using Learning to Improve Multi-Agent Systems for Design , 1997 .

[134]  Svetha Venkatesh,et al.  A Framework for Coordination and Learning among Teams of Agents , 1997, Agents and Multi-Agent Systems Formalisms, Methodologies, and Applications.

[135]  Jürgen Schmidhuber,et al.  On Learning Soccer Strategies , 1997, ICANN.

[136]  Sandip Sen,et al.  Co-adaptation in a Team , 1997 .

[137]  Dave Cliff,et al.  Creatures: artificial life autonomous software agents for home entertainment , 1997, AGENTS '97.

[138]  Edmund H. Durfee,et al.  Agents Learning about Agents: A Framework and Analysis , 1997 .

[139]  Richard K. Belew,et al.  New Methods for Competitive Coevolution , 1997, Evolutionary Computation.

[140]  Larry Bull,et al.  Evolutionary computing in multi-agent environments: Partners , 1997 .

[141]  Maja J. Mataric,et al.  Reinforcement Learning in the Multi-Robot Domain , 1997, Auton. Robots.

[142]  Peter Stone,et al.  Layered Learning in Multiagent Systems , 1997, AAAI/IAAI.

[143]  Gerhard Weiß Distributed Artificial Intelligence Meets Machine Learning Learning in Multi-Agent Environments , 1997, Lecture Notes in Computer Science.

[144]  Vidroha Debroy,et al.  Genetic Programming , 1998, Lecture Notes in Computer Science.

[145]  Luc Steels,et al.  Synthesising the origins of language and meaning using co-evolution, self-organisation and level formation , 1998 .

[146]  Sandip Sen,et al.  Individual learning of coordination knowledge , 1998, J. Exp. Theor. Artif. Intell..

[147]  Sandip Sen,et al.  Learning cases to resolve conflicts and improve group behavior , 1998, Int. J. Hum. Comput. Stud..

[148]  Craig Boutilier,et al.  The Dynamics of Reinforcement Learning in Cooperative Multiagent Systems , 1998, AAAI/IAAI.

[149]  Michael P. Wellman,et al.  Online learning about other agents in a dynamic multiagent system , 1998, AGENTS '98.

[150]  Maja J. Mataric,et al.  Using communication to reduce locality in distributed multiagent learning , 1997, J. Exp. Theor. Artif. Intell..

[151]  R. Arkin,et al.  Behavioral diversity in learning robot teams , 1998 .

[152]  Franz Oppacher,et al.  ASGA: Improving the Ant System by Integration with Genetic Algorithms , 1998 .

[153]  Larry Bull,et al.  Evolutionary Computing in Multi-agent Environments: Operators , 1998, Evolutionary Programming.

[154]  Michael Luck,et al.  Foundations of Multi-Agent Systems: Techniques, Tools and Theory , 1998, The Knowledge Engineering Review.

[155]  R. Paul Wiegand,et al.  Applying Diffusion to a Cooperative Coevolutionary Model , 1998, PPSN.

[156]  Michael P. Wellman,et al.  Multiagent Reinforcement Learning: Theoretical Framework and an Algorithm , 1998, ICML.

[157]  J. Pollack,et al.  Coevolving the "Ideal" Trainer: Application to the Discovery of Cellular Automata Rules , 1998 .

[158]  Astro Teller,et al.  Evolving Team Darwin United , 1998, RoboCup.

[159]  Wilfried Brauer,et al.  Multi-machine scheduling-a multi-agent learning approach , 1998, Proceedings International Conference on Multi Agent Systems (Cat. No.98EX160).

[160]  J. Pollack,et al.  Challenges in coevolutionary learning: arms-race dynamics, open-endedness, and medicocre stable states , 1998 .

[161]  Richard S. Sutton,et al.  Introduction to Reinforcement Learning , 1998 .

[162]  Sean Luke,et al.  Genetic Programming Produced Competitive Soccer Softbot Teams for RoboCup97 , 1998 .

[163]  Hitoshi Iba,et al.  Evolutionary Learning of Communicating Agents , 1998, Inf. Sci..

[164]  Edmund H. Durfee,et al.  The moving target function problem in multi-agent learning , 1998, Proceedings International Conference on Multi Agent Systems (Cat. No.98EX160).

[165]  Kagan Tumer,et al.  Using Collective Intelligence to Route Internet Traffic , 1998, NIPS.

[166]  K. Sigmund,et al.  Evolution of Indirect Reciprocity by Image Scoring/ The Dynamics of Indirect Reciprocity , 1998 .

[167]  Sandip Sen,et al.  Evolution and learning in multiagent systems , 1998, Int. J. Hum. Comput. Stud..

[168]  Sandip Sen,et al.  Shared memory based cooperative coevolution , 1998, 1998 IEEE International Conference on Evolutionary Computation Proceedings. IEEE World Congress on Computational Intelligence (Cat. No.98TH8360).

[169]  M. Nowak,et al.  Evolution of indirect reciprocity by image scoring , 1998, Nature.

[170]  M. Studdert-Kennedy,et al.  Approaches To The Evolution Of Language: Social And Cognitive Bases , 1998 .

[171]  D. Fudenberg,et al.  The Theory of Learning in Games , 1998 .

[172]  Tucker Balch,et al.  Reward and Diversity in Multirobot Foraging , 1999, IJCAI 1999.

[173]  Akira Hara,et al.  Emergence of the cooperative behavior using ADG; Automatically Defined Groups , 1999, GECCO.

[174]  Pierre Dillenbourg,et al.  What is 'multi' in multi-agent learning? , 1999 .

[175]  Sandip Sen,et al.  Learning in multiagent systems , 1999 .

[176]  Luc Steels,et al.  Collective Learning and Semiotic Dynamics , 1999, ECAL.

[177]  Hitoshi Iba,et al.  Evolving multiple agents by genetic programming , 1999 .

[178]  Moshe Tennenholtz,et al.  Continuing research in multi-agent systems , 1999, Knowl. Eng. Rev..

[179]  Matthias Fuchs,et al.  Experiments in learning prototypical situations for variants of the pursuit game , 1999 .

[180]  Svetha Venkatesh,et al.  Learning Other Agents' Preferences in Multi-Agent Negotiation Using the Bayesian Classifier , 1999, Int. J. Cooperative Inf. Syst..

[181]  Lawrence J. Fogel,et al.  Intelligence Through Simulated Evolution: Forty Years of Evolutionary Programming , 1999 .

[182]  Victor R. Lesser,et al.  Cooperative Multiagent Systems: A Personal View of the State of the Art , 1999, IEEE Trans. Knowl. Data Eng..

[183]  Kagan Tumer,et al.  General principles of learning-based multi-agent systems , 1999, AGENTS '99.

[184]  Piotr J. Gmytrasiewicz,et al.  Learning models of other agents using influence diagrams , 1999 .

[185]  Andrew W. Moore,et al.  Distributed Value Functions , 1999, ICML.

[186]  Jürgen Schmidhuber,et al.  Reinforcement Learning Soccer Teams with Incomplete World Models , 1999, Auton. Robots.

[187]  Annie S. Wu,et al.  Evolving control for distributed micro air vehicles , 1999, Proceedings 1999 IEEE International Symposium on Computational Intelligence in Robotics and Automation. CIRA'99 (Cat. No.99EX375).

[188]  Gerhard Weiss,et al.  Multiagent systems: a modern approach to distributed artificial intelligence , 1999 .

[189]  Maja J. Matarić,et al.  Exploiting Embodiment in Multi-Robot Teams , 1999 .

[190]  Manuela M. Veloso,et al.  On Behavior Classification in Adversarial Environments , 2000, DARS.

[191]  Neil Immerman,et al.  The Complexity of Decentralized Control of Markov Decision Processes , 2000, UAI.

[192]  Kyle Wagner,et al.  Cooperative Strategies and the Evolution of Communication , 2000, Artificial Life.

[193]  Sandip Sen,et al.  Evaluating concurrent reinforcement learners , 2000, Proceedings Fourth International Conference on MultiAgent Systems.

[194]  Manuela M. Veloso,et al.  Multiagent Systems: A Survey from a Machine Learning Perspective , 2000, Auton. Robots.

[195]  Luc Steels,et al.  The puzzle of language evolution , 2000, Kognitionswissenschaft.

[196]  Manuela Veloso,et al.  An Analysis of Stochastic Game Theory for Multiagent Reinforcement Learning , 2000 .

[197]  Kee-Eung Kim,et al.  Learning to Cooperate via Policy Search , 2000, UAI.

[198]  Kenneth A. De Jong,et al.  Cooperative Coevolution: An Architecture for Evolving Coadapted Subcomponents , 2000, Evolutionary Computation.

[199]  D. Vengerov,et al.  An Empirical Model of Factor Adjustment Dynamics , 2006 .

[200]  C. Lee Giles,et al.  Talking Helps: Evolving Communicating Agents for the Predator-Prey Pursuit Problem , 2000, Artificial Life.

[201]  Arthur L. Samuel,et al.  Some studies in machine learning using the game of checkers , 2000, IBM J. Res. Dev..

[202]  Richard E. Korf,et al.  On Pruning Techniques for Multi-Player Games , 2000, AAAI/IAAI.

[203]  Bikramjit Banerjee,et al.  Learning Mutual Trust , 2000, Trust in Cyber-societies.

[204]  Shin Ishii,et al.  Multi-agent reinforcement learning: an approach based on the other agent's internal model , 2000, Proceedings Fourth International Conference on MultiAgent Systems.

[205]  Sandip Sen,et al.  Evolving agent socienties that avoid social dilemmas , 2000, GECCO.

[206]  Michael H. Bowling,et al.  Convergence Problems of General-Sum Multiagent Reinforcement Learning , 2000, ICML.

[207]  James P. Crutchfield,et al.  Resource sharing and coevolution in evolving cellular automata , 1999, IEEE Trans. Evol. Comput..

[208]  Martin Lauer,et al.  An Algorithm for Distributed Reinforcement Learning in Cooperative Multi-Agent Systems , 2000, ICML.

[209]  Lynne E. Parker,et al.  Multi-Robot Learning in a Cooperative Observation Task , 2000, DARS.

[210]  Jordan B. Pollack,et al.  A Game-Theoretic Approach to the Simple Coevolutionary Algorithm , 2000, PPSN.

[211]  Lynne E. Parker,et al.  Current State of the Art in Distributed Autonomous Mobile Robotics , 2000 .

[212]  D. Vengerov,et al.  Learning, Cooperation, and Coordination in Multi-Agent Systems , 2000 .

[213]  Josh C. Bongard,et al.  The Legion System: A Novel Approach to Evolving Hetrogeneity for Collective Problem Solving , 2000, EuroGP.

[214]  Jean-Louis Deneubourg,et al.  From local actions to global tasks: stigmergy and collective robotics , 2000 .

[215]  Kenneth A. De Jong,et al.  Evolving Behaviors for Cooperating Agents , 2000, ISMIS.

[216]  Melanie Mitchell,et al.  Evolving Cellular Automata with Genetic Algorithms: A Review of Recent Work , 2000 .

[217]  Petra Funk,et al.  Multiagentsystems - A Modern Approach to Distributed Artificial Intelligence , 2000, Künstliche Intell..

[218]  Graham Kendall,et al.  An evolutionary approach for the tuning of a chess evaluation function using population dynamics , 2001, Proceedings of the 2001 Congress on Evolutionary Computation (IEEE Cat. No.01TH8546).

[219]  Graham Kendall,et al.  An Investigation of an Adaptive Poker Player , 2001, Australian Joint Conference on Artificial Intelligence.

[220]  Matt Quinn,et al.  A comparison of approaches to the evolution of homogeneous multi-robot teams , 2001, Proceedings of the 2001 Congress on Evolutionary Computation (IEEE Cat. No.01TH8546).

[221]  J. Pollack,et al.  Coevolutionary dynamics in a minimal substrate , 2001 .

[222]  Sandip Sen,et al.  Proceedings of the fifth international conference on Autonomous agents , 2001 .

[223]  Manuela M. Veloso,et al.  Rational and Convergent Learning in Stochastic Games , 2001, IJCAI.

[224]  Adam Szarowicz,et al.  An Improved Q-Learning Algorithm Using Synthetic Pheromones , 2001, CEEMAS.

[225]  H. V. Parunak,et al.  Tuning Synthetic Pheromones With Evolutionary Computing , 2001 .

[226]  Peter Stone,et al.  Keepaway Soccer: A Machine Learning Testbed , 2001, RoboCup.

[227]  Gaurav S. Sukhatme,et al.  Emergent bucket brigading: a simple mechanisms for improving performance in multi-robot constrained-space foraging tasks , 2001, AGENTS '01.

[228]  Thomas Miconi A collective genetic algorithm , 2001 .

[229]  Olivier Buffet,et al.  Multi-Agent Systems by Incremental Gradient Reinforcement Learning , 2001, IJCAI.

[230]  R. Paul Wiegand,et al.  An empirical analysis of collaboration methods in cooperative coevolutionary algorithms , 2001 .

[231]  Michael L. Littman,et al.  Friend-or-Foe Q-learning in General-Sum Games , 2001, ICML.

[232]  Julie A. Adams,et al.  Multiagent Systems: A Modern Approach to Distributed Artificial Intelligence , 2001, AI Mag..

[233]  David B. Fogel,et al.  Blondie24: Playing at the Edge of AI , 2001 .

[234]  Tom Lenaerts,et al.  Learning agents in a homo egualis society , 2001 .

[235]  Matthew Quinn,et al.  Evolving Communication without Dedicated Communication Channels , 2001, ECAL.

[236]  David Lazer,et al.  Emergent Actors in World Politics: How States and Nations Develop by Lars-Erik Cederman , 2001, J. Artif. Soc. Soc. Simul..

[237]  Steven M. Gustafson,et al.  Layered Learning in Genetic Programming for a Cooperative Robot Soccer Problem , 2001, EuroGP.

[238]  Kagan Tumer,et al.  Optimal Payoff Functions for Members of Collectives , 2001, Adv. Complex Syst..

[239]  Dorothy Ndedi Monekosso,et al.  Phe-Q: A Pheromone Based Q-Learning , 2001, Australian Joint Conference on Artificial Intelligence.

[240]  Robert Axelrod,et al.  The Evolution of Strategies in the Iterated Prisoner's Dilemma , 2001 .

[241]  Alan C. Schultz,et al.  Heterogeneity in the Coevolved Behaviors of Mobile Robots: The Emergence of Specialists , 2001, IJCAI.

[242]  Angelo Cangelosi,et al.  Evolution of communication and language using signals, symbols, and words , 2001, IEEE Trans. Evol. Comput..

[243]  Alex Lubberts and Risto Miikkulainen Co-Evolving a Go-Playing Neural network , 2001 .

[244]  Olivier Buffet,et al.  Incremental reinforcement learning for designing multi-agent systems , 2001, AGENTS '01.

[245]  Learning in Large Cooperative Multi-Robot Domains , 2001 .

[246]  Manuela M. Veloso,et al.  Multiagent learning using a variable learning rate , 2002, Artif. Intell..

[247]  Ronen I. Brafman,et al.  Efficient learning equilibrium , 2004, Artificial Intelligence.

[248]  Dorothy Ndedi Monekosso,et al.  An Analysis of the Pheromone Q-Learning Algorithm , 2002, IBERAMIA.

[249]  Pradeep K. Khosla,et al.  The necessity of average rewards in cooperative multirobot learning , 2002, Proceedings 2002 IEEE International Conference on Robotics and Automation (Cat. No.02CH37292).

[250]  Lee Spector,et al.  Using Genetic Programming with Multiple Data Types and Automatic Modularization to Evolve Decentralized and Coordinated Navigation in Multi-Agent Systems , 2002, GECCO Late Breaking Papers.

[251]  Daniel Kudenko,et al.  Reinforcement learning of coordination in cooperative multi-agent systems , 2002, AAAI/IAAI.

[252]  Graham Kendall,et al.  An investigation, using co-evolution, to evolve an Awari player , 2002, Proceedings of the 2002 Congress on Evolutionary Computation. CEC'02 (Cat. No.02TH8600).

[253]  Melanie Mitchell,et al.  A Comparison of Evolutionary and Coevolutionary Search , 2002, Int. J. Comput. Intell. Appl..

[254]  Xiaofeng Wang,et al.  Reinforcement Learning to Play an Optimal Nash Equilibrium in Team Markov Games , 2002, NIPS.

[255]  Barbara Webb,et al.  Swarm Intelligence: From Natural to Artificial Systems , 2002, Connect. Sci..

[256]  Daniel Kudenko,et al.  Reinforcement Learning Approaches to Coordination in Cooperative Multi-agent Systems , 2002, Adaptive Agents and Multi-Agents Systems.

[257]  Olivier Buffet,et al.  Learning to weigh basic behaviors in scalable agents , 2002, AAMAS '02.

[258]  Sandip Sen,et al.  Adaptation Using Cases in Cooperative Groups , 2002 .

[259]  H. Van Dyke Parunak,et al.  Evolving adaptive pheromone path planning mechanisms , 2002, AAMAS '02.

[260]  Jeffrey K. Bassett A Study of Generalization Techniques in Evolutionary Rule Learning , 2002 .

[261]  Steven M. Gustafson,et al.  Genetic Programming And Multi-agent Layered Learning By Reinforcements , 2002, GECCO.

[262]  D. Kudenko,et al.  Improving on the reinforcement learning of coordination in cooperative multi-agent systems , 2002 .

[263]  Lynne E. Parker,et al.  Robot Teams: From Diversity to Polymorphism , 2002 .

[264]  Jean Oh,et al.  Electric Elves: Agent Technology for Supporting Human Organizations , 2002, AI Mag..

[265]  Kenneth A. De Jong,et al.  Modeling Variation in Cooperative Coevolution Using Evolutionary Game Theory , 2002, FOGA.

[266]  Akira Hayashi,et al.  A multiagent reinforcement learning algorithm using extended optimal response , 2002, AAMAS '02.

[267]  L. Spector,et al.  Evolutionary Dynamics Discovered via Visualization in the breve Simulation Environment , 2002 .

[268]  Gaurav S. Sukhatme,et al.  Adaptive spatio-temporal organization in groups of robots , 2002, IEEE/RSJ International Conference on Intelligent Robots and Systems.

[269]  Santiago Ontañón,et al.  A bartering approach to improve multiagent learning , 2002, AAMAS '02.

[270]  K.A. De Jong,et al.  Analyzing cooperative coevolution with evolutionary game theory , 2002, Proceedings of the 2002 Congress on Evolutionary Computation. CEC'02 (Cat. No.02TH8600).

[271]  Kagan Tumer,et al.  Learning sequences of actions in collectives of autonomous agents , 2002, AAMAS '02.

[272]  Michail G. Lagoudakis,et al.  Coordinated Reinforcement Learning , 2002, ICML.

[273]  R. Paul Wiegand,et al.  Guaranteeing Coevolutionary Objective Measures , 2002, FOGA.

[274]  Lynne E. Parker,et al.  Distributed Algorithms for Multi-Robot Observation of Multiple Moving Targets , 2002, Auton. Robots.

[275]  George Cybenko,et al.  Decentralized control for coordinated flow of multi-agent systems , 2002, Proceedings of the 2002 International Joint Conference on Neural Networks. IJCNN'02 (Cat. No.02CH37290).

[276]  Tom Lenaerts,et al.  A selection-mutation model for Q-learning in multi-agent systems , 2003, AAMAS '03.

[277]  Leslie Pack Kaelbling,et al.  All learning is Local: Multi-agent Learning in Global Reward Games , 2003, NIPS.

[278]  Thomas Jansen,et al.  Exploring the Explorative Advantage of the Cooperative Coevolutionary (1+1) EA , 2003, GECCO.

[279]  Sandip Sen,et al.  Towards a Pareto-optimal solution in general-sum games , 2003, AAMAS '03.

[280]  Tom Fawcett,et al.  Proceedings, Twentieth International Conference on Machine Learning , 2003 .

[281]  Shimon Whiteson,et al.  Concurrent layered learning , 2003, AAMAS '03.

[282]  David V. Pynadath,et al.  Taming Decentralized POMDPs: Towards Efficient Policy Computation for Multiagent Settings , 2003, IJCAI.

[283]  Manuela Veloso,et al.  Multiagent learning in the presence of agents with limitations , 2003 .

[284]  William T. B. Uther,et al.  Adversarial Reinforcement Learning , 2003 .

[285]  Sven Koenig,et al.  Trail-laying robots for robust terrain coverage , 2003, 2003 IEEE International Conference on Robotics and Automation (Cat. No.03CH37422).

[286]  R. Paul Wiegand,et al.  Improving Coevolutionary Search for Optimal Multiagent Behaviors , 2003, IJCAI.

[287]  Keith B. Hall,et al.  Correlated Q-Learning , 2003, ICML.

[288]  Peter Stone,et al.  A polynomial-time Nash equilibrium algorithm for repeated games , 2003, EC '03.

[289]  Craig Boutilier,et al.  Coordination in multiagent reinforcement learning: a Bayesian approach , 2003, AAMAS '03.

[290]  Michael P. Wellman,et al.  Nash Q-Learning for General-Sum Stochastic Games , 2003, J. Mach. Learn. Res..

[291]  Jeffrey K. Bassett,et al.  An Analysis of Cooperative Coevolutionary Algorithms , 2003 .

[292]  Thomas Miconi When Evolving Populations is Better than Coevolving Individuals: The Blind Mice Problem , 2003, IJCAI.

[293]  R. Paul Wiegand,et al.  A Visual Demonstration of Convergence Properties of Cooperative Coevolution , 2004, PPSN.

[294]  S. Luke,et al.  Ant Foraging Revisited , 2004 .

[295]  Manuela M. Veloso,et al.  Existence of Multiagent Equilibria with Limited Agents , 2004, J. Artif. Intell. Res..

[296]  Bohdana Ratitch,et al.  Multi-agent patrolling with reinforcement learning , 2004, Proceedings of the Third International Joint Conference on Autonomous Agents and Multiagent Systems, 2004. AAMAS 2004..

[297]  Daniel Kudenko,et al.  Reinforcement learning of coordination in heterogeneous cooperative multi-agent systems , 2004, Proceedings of the Third International Joint Conference on Autonomous Agents and Multiagent Systems, 2004. AAMAS 2004..

[298]  Eugénio C. Oliveira,et al.  Learning from multiple sources , 2004, Proceedings of the Third International Joint Conference on Autonomous Agents and Multiagent Systems, 2004. AAMAS 2004..

[299]  Rudolf Paul Wiegand,et al.  An analysis of cooperative coevolutionary algorithms , 2004 .

[300]  R. Paul Wiegand,et al.  A Sensitivity Analysis of a Cooperative Coevolutionary Algorithm Biased for Optimization , 2004, GECCO.

[301]  Nicholas R. Jennings,et al.  A Roadmap of Agent Research and Development , 2004, Autonomous Agents and Multi-Agent Systems.

[302]  Sean Luke,et al.  Learning Ant Foraging Behaviors , 2004 .

[303]  Richard Alterman,et al.  Autonomous Agents that Learn to Better Coordinate , 2004, Autonomous Agents and Multi-Agent Systems.

[304]  Sean Luke,et al.  A pheromone-based utility model for collaborative foraging , 2004, Proceedings of the Third International Joint Conference on Autonomous Agents and Multiagent Systems, 2004. AAMAS 2004..

[305]  Michael P. Wellman,et al.  Conjectural Equilibrium in Multiagent Learning , 1998, Machine Learning.

[306]  Jürgen Schmidhuber,et al.  Learning Team Strategies: Soccer Case Studies , 1998, Machine Learning.

[307]  Jeffrey O. Kephart,et al.  Pricing in Agent Economies Using Multi-Agent Q-Learning , 2002, Autonomous Agents and Multi-Agent Systems.

[308]  Edmund H. Durfee,et al.  Predicting the Expected Behavior of Agents that Learn About Agents: The CLRI Framework , 2004, Autonomous Agents and Multi-Agent Systems.

[309]  Jordan B. Pollack,et al.  Co-Evolution in the Successful Learning of Backgammon Strategy , 1998, Machine Learning.

[310]  Jeffrey S. Rosenschein,et al.  Best-response multiagent learning in non-stationary environments , 2004, Proceedings of the Third International Joint Conference on Autonomous Agents and Multiagent Systems, 2004. AAMAS 2004..

[311]  Gary B. Parker,et al.  Co-Evolving Team Capture Strategies for Dissimilar Robots , 2004, AAAI Technical Report.

[312]  John J. Grefenstette,et al.  Learning sequential decision rules using simulation models and competition , 2004, Machine Learning.

[313]  Maarten Peeters,et al.  Multi-Agent Learning in Conflicting Multi-Level Games with Incomplete Information , 2004, AAAI Technical Report.

[314]  Yufeng Liu,et al.  Stochastic Direct Reinforcement: Application to Simple Games with Recurrence , 2004, AAAI Technical Report.

[315]  Andrew B. Williams,et al.  Learning to Share Meaning in a Multi-Agent System , 2004, Autonomous Agents and Multi-Agent Systems.

[316]  Elena Popovici,et al.  Understanding Competitive Co-Evolutionary Dynamics via Fitness Landscapes , 2004, AAAI Technical Report.

[317]  Dave Cliff,et al.  Creatures: Entertainment Software Agents with Artificial Life , 2004, Autonomous Agents and Multi-Agent Systems.

[318]  Yoav Shoham,et al.  On the Agenda(s) of Research on Multi-Agent Learning , 2004, AAAI Technical Report.

[319]  Peter Stone,et al.  Multiagent traffic management: a reservation-based intersection control mechanism , 2004, Proceedings of the Third International Joint Conference on Autonomous Agents and Multiagent Systems, 2004. AAMAS 2004..

[320]  Phil Husbands,et al.  Ant Foraging Revisited , 2004 .

[321]  Karl Tuyls,et al.  Analyzing Multi-agent Reinforcement Learning Using Evolutionary Dynamics , 2004, ECML.

[322]  Leslie Pack Kaelbling,et al.  Multi-Agent Learning in Mobilized Ad-Hoc Networks , 2004, AAAI Technical Report.

[323]  Sridhar Mahadevan,et al.  Learning to communicate and act using hierarchical reinforcement learning , 2004, Proceedings of the Third International Joint Conference on Autonomous Agents and Multiagent Systems, 2004. AAMAS 2004..

[324]  R. Paul Wiegand,et al.  Spatial Embedding and Loss of Gradient in Cooperative Coevolutionary Algorithms , 2004, PPSN.

[325]  Manuela Veloso,et al.  Opportunities for Learning in Multi-Agent Meeting Scheduling , 2004, AAAI Technical Report.

[326]  Lee Spector,et al.  Emergence of Collective Behavior in Evolving Populations of Flying Agents , 2003, Genetic Programming and Evolvable Machines.

[327]  John J. Grefenstette,et al.  Learning Sequential Decision Rules Using Simulation Models and Competition , 1990, Machine Learning.

[328]  Peter Stone,et al.  Multiagent traffic management: an improved intersection control mechanism , 2005, AAMAS '05.

[329]  Richard S. Sutton,et al.  Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.

[330]  Richard S. Sutton,et al.  Learning to predict by the methods of temporal differences , 1988, Machine Learning.

[331]  Sean Luke,et al.  Tunably decentralized algorithms for cooperative target observation , 2005, AAMAS '05.

[332]  R. Paul Wiegand,et al.  Robustness in cooperative coevolution , 2006, GECCO '06.

[333]  Sridhar Mahadevan,et al.  Hierarchical multi-agent reinforcement learning , 2001, AGENTS '01.

[334]  Kenneth DeJong Evolutionary computation: a unified approach , 2007, GECCO.

[335]  Phil Husbands,et al.  Evolving Formation Movement for a Homogeneous Multi-Robot System: Teamwork and Role-Allocation with Real Robots , 2007 .

[336]  Robert J. Collins,et al.  AntFarm: Towards Simulated Evolution , 2007 .

[337]  Wiering,et al.  Reinforcement Learning Soccer Teams with Incomplete World , .