Learning Situation-Specific Coordination in Cooperative Multi-agent Systems

Achieving effective cooperation in a multi-agent system is difficult for a number of reasons, such as each agent's limited and possibly outdated view of other agents' activities and uncertainty about the outcomes of interacting non-local tasks. In this paper, we present a learning system called COLLAGE that endows agents with the capability to learn how to choose the most appropriate coordination strategy from a set of available coordination strategies. COLLAGE relies on meta-level information about the agents' problem-solving situations to guide them toward a suitable choice of coordination strategy. We present empirical results that strongly indicate the effectiveness of the learning algorithm.
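
The core idea, learning a mapping from abstracted problem-solving situations to the best-performing coordination strategy, can be illustrated with a minimal sketch. The Python sketch below is an assumption-laden illustration, not COLLAGE's actual algorithm: the class name, the epsilon-greedy selection rule, the placeholder strategy names, and the example situation features are all hypothetical.

```python
import random
from collections import defaultdict

# Placeholder strategy names; the real set of coordination strategies
# is domain-specific and not taken from the paper.
STRATEGIES = ["strategy-A", "strategy-B", "strategy-C"]

class SituationSpecificSelector:
    """Hypothetical sketch: keep a running utility estimate for each
    coordination strategy per abstracted situation (a tuple of
    meta-level features), and pick the best-known strategy with
    occasional exploration."""

    def __init__(self, epsilon=0.1, learning_rate=0.2):
        self.epsilon = epsilon            # exploration probability
        self.learning_rate = learning_rate
        # situation -> strategy -> estimated utility
        self.q = defaultdict(lambda: {s: 0.0 for s in STRATEGIES})

    def choose(self, situation):
        """Pick a strategy for this situation (epsilon-greedy)."""
        if random.random() < self.epsilon:
            return random.choice(STRATEGIES)
        estimates = self.q[situation]
        return max(estimates, key=estimates.get)

    def update(self, situation, strategy, reward):
        """Move the utility estimate toward the observed episode quality."""
        old = self.q[situation][strategy]
        self.q[situation][strategy] = old + self.learning_rate * (reward - old)

# Usage: abstract the current episode into meta-level features (assumed
# names), select a strategy, run the episode under it, then feed back
# the observed solution quality (assumed 0..1 scale).
selector = SituationSpecificSelector()
situation = ("high-task-interaction", "tight-deadlines")
strategy = selector.choose(situation)
selector.update(situation, strategy, reward=0.8)
```

The design choice the sketch highlights is the indirection through the situation abstraction: rather than committing to one coordination strategy globally, the agent conditions its choice on cheap meta-level features of the current problem-solving episode.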
