Paper information - Team-Partitioned, Opaque-Transition Reinforcement Learning

Team-Partitioned, Opaque-Transition Reinforcement Learning

We present a novel multi-agent learning paradigm called team-partitioned, opaque-transition reinforcement learning (TPOT-RL). TPOT-RL introduces the use of action-dependent features to generalize the state space. In our work, we use a learned action-dependent feature space to aid higher-level reinforcement learning. TPOT-RL is an effective technique to allow a team of agents to learn to cooperate towards the achievement of a specific goal. It is an adaptation of traditional RL methods that is applicable in complex, non-Markovian, multi-agent domains with large state spaces and limited training opportunities. TPOT-RL is fully implemented and has been tested in the robotic soccer domain, a complex, multi-agent framework. This paper presents the algorithmic details of TPOT-RL as well as empirical results demonstrating the effectiveness of the developed multi-agent learning approach with learned features.
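To make the idea of action-dependent features concrete, the following is a minimal sketch and not the paper's implementation. It assumes a discrete feature per (state, action) pair, a per-agent partition id, and a simple update that moves the stored value toward the observed reward without using the next state. The names `ActionFeatureLearner`, `feature_fn`, `partition_fn`, and `alpha` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


class ActionFeatureLearner:
    """Toy value learner indexed by (partition, action feature, action).

    Assumption: raw states are never used as table keys, so any two states
    that yield the same partition and the same feature for an action share
    one value entry. This is how the sketch generalizes over a large state space.
    """

    def __init__(self, actions, feature_fn, partition_fn, alpha=0.1):
        self.actions = list(actions)
        self.feature_fn = feature_fn      # hypothetical: (state, action) -> discrete feature
        self.partition_fn = partition_fn  # hypothetical: state -> this agent's partition id
        self.alpha = alpha                # learning rate (assumed value)
        self.q = defaultdict(float)       # unseen entries default to 0.0

    def key(self, state, action):
        # Collapse the raw state into a small, action-dependent key.
        return (self.partition_fn(state), self.feature_fn(state, action), action)

    def choose(self, state):
        # Greedy choice over the generalized values (no exploration here).
        return max(self.actions, key=lambda a: self.q[self.key(state, a)])

    def update(self, state, action, reward):
        # Move the estimate toward the observed reward. The next state is
        # not used, an assumption made because transitions are treated as opaque.
        k = self.key(state, action)
        self.q[k] += self.alpha * (reward - self.q[k])
```

In this sketch, an agent that was rewarded for an action whose feature was "open" will prefer that action again in any other raw state that maps to the same partition and feature, even if it has never seen that raw state before.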

Manuela M. Veloso | Peter Stone
