Action Selection and Action Control for Playing Table Soccer Using Markov Decision Processes
Master Thesis

StarKick is a commercially available table soccer robot that challenges even advanced human players. However, StarKick's set of actions is limited, and its action selection scheme is not flexible enough to incorporate more elaborate actions. In this thesis, new actions for stopping and dribbling the ball are developed. Stopping is achieved by locking the ball between the playing surface and a playing figure. Dribbling makes the ball roll at a controllable speed within the reachable area of the playing figures of one rod. With these new actions, the ball can be deliberately passed and stopped. To decide which action should be taken in a given situation, an action selection scheme based on Markov Decision Processes (MDPs) and reinforcement learning is developed. To reduce the state space, the basic actions are combined into more complex actions, and the MDP is structured into four modules, each containing a set of states together with the actions applicable in those states. A simple reinforcement learning algorithm is implemented in the MDP framework: the transition probabilities are estimated by counting observed transitions, and these updated estimates are propagated by a policy iteration algorithm during a game. A series of experiments in real table soccer games shows that the newly developed actions are robust, that the MDP model performs as intended, and that reinforcement learning improves StarKick's performance in a simplified game. Making reinforcement learning work over the whole game, however, would require more training in real games than was feasible within this thesis.
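The abstract describes estimating transition probabilities by counting and propagating the updates with policy iteration. The following is a minimal, self-contained sketch of that general technique in a tabular setting; the class name, state/action sizes, and reward bookkeeping are illustrative assumptions, not details taken from the thesis.

```python
import numpy as np

class CountingMDP:
    """Hypothetical tabular MDP whose transition model is learned by
    counting observed (state, action, next_state) triples; the policy
    is then recomputed by standard policy iteration."""

    def __init__(self, n_states, n_actions, gamma=0.9):
        self.nS, self.nA, self.gamma = n_states, n_actions, gamma
        # A count prior of 1 (Laplace smoothing) avoids zero-probability
        # transitions before any data has been seen.
        self.counts = np.ones((n_states, n_actions, n_states))
        self.R = np.zeros((n_states, n_actions))  # running reward estimates

    def observe(self, s, a, s_next, reward):
        """Record one observed transition and update the reward estimate."""
        self.counts[s, a, s_next] += 1
        n = self.counts[s, a].sum()
        self.R[s, a] += (reward - self.R[s, a]) / n  # incremental mean

    def transition_probs(self):
        """Normalize counts into estimated probabilities P(s'|s, a)."""
        return self.counts / self.counts.sum(axis=2, keepdims=True)

    def policy_iteration(self):
        """Alternate exact policy evaluation and greedy improvement
        until the policy no longer changes."""
        P = self.transition_probs()
        policy = np.zeros(self.nS, dtype=int)
        while True:
            # Evaluation: solve (I - gamma * P_pi) V = R_pi exactly.
            P_pi = P[np.arange(self.nS), policy]
            R_pi = self.R[np.arange(self.nS), policy]
            V = np.linalg.solve(np.eye(self.nS) - self.gamma * P_pi, R_pi)
            # Improvement: act greedily w.r.t. the one-step lookahead.
            Q = self.R + self.gamma * (P @ V)
            new_policy = Q.argmax(axis=1)
            if np.array_equal(new_policy, policy):
                return policy, V
            policy = new_policy
```

In use, the robot would call `observe` after every executed action and rerun `policy_iteration` periodically during play, so the policy tracks the empirically estimated transition model.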