Learning to Coordinate without Sharing Information

Researchers in the field of Distributed Artificial Intelligence (DAI) have been developing efficient mechanisms to coordinate the activities of multiple autonomous agents. The need for coordination arises because agents must share the resources and expertise required to achieve their goals. Previous work in the area includes sophisticated information-exchange protocols, heuristics for negotiation, and formal models of the possibilities of conflict and cooperation among agent interests. To handle the changing requirements of continuous, dynamic environments, we propose learning as a means of providing additional possibilities for effective coordination. Using reinforcement learning techniques on a block-pushing problem, we show that agents can learn complementary policies for following a desired path without any knowledge of each other. We theoretically analyze and experimentally verify the effect of the learning rate on system convergence, and demonstrate the benefits of applying learned coordination knowledge to similar problems. Reinforcement-learning-based coordination can be achieved in both cooperative and non-cooperative domains, and in domains with noisy communication channels and other stochastic characteristics that pose a formidable challenge to other coordination schemes.

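To make the mechanism concrete, here is a minimal sketch of the general idea: two independent Q-learning agents jointly push a block toward a goal, each observing only the block's position and a shared scalar reward, never the other agent's action. The 1-D track, the three-action push set, and the learning constants are illustrative assumptions, not the paper's actual task formulation.

import random
from collections import defaultdict

random.seed(0)
N_STATES = 10          # block positions 0..9; the goal is position 9
ACTIONS = (-1, 0, 1)   # push left, don't push, push right
ALPHA, GAMMA, EPS = 0.2, 0.9, 0.1

def make_agent():
    # Per-agent Q-table: state -> estimated value of each of its own actions.
    return defaultdict(lambda: {a: 0.0 for a in ACTIONS})

def choose(q, s):
    # Epsilon-greedy action selection.
    if random.random() < EPS:
        return random.choice(ACTIONS)
    return max(q[s], key=q[s].get)

def step(s, a1, a2):
    # The block moves by the clipped sum of the two agents' pushes.
    s2 = max(0, min(N_STATES - 1, s + max(-1, min(1, a1 + a2))))
    r = 1.0 if s2 == N_STATES - 1 else -0.1   # shared scalar reward
    return s2, r

q1, q2 = make_agent(), make_agent()
for episode in range(500):
    s = 0
    for t in range(100):                      # cap episode length
        a1, a2 = choose(q1, s), choose(q2, s)
        s2, r = step(s, a1, a2)
        for q, a in ((q1, a1), (q2, a2)):
            # Independent Q-learning update: each agent treats the other
            # as an unmodeled part of the environment.
            q[s][a] += ALPHA * (r + GAMMA * max(q[s2].values()) - q[s][a])
        s = s2
        if s == N_STATES - 1:
            break

# Greedy joint action learned for each non-goal state.
print({s: (max(q1[s], key=q1[s].get), max(q2[s], key=q2[s].get))
       for s in range(N_STATES - 1)})

Because each agent treats its partner as part of a seemingly stationary environment, the learning rate ALPHA controls how quickly the two policies settle into consistent joint behavior, loosely mirroring the learning-rate convergence effect the abstract mentions.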