General Terms: Algorithms

Immediate rewards play a key role in reinforcement learning (RL) because they help the system deal with the credit assignment problem. The definition of the reward function therefore strongly affects both how fast the system learns and which policy it converges to. It matters even more in multi-agent learning, where the state space is usually larger. We propose a genetic algorithm (GA) based reward function shaping method for multi-robot learning problems and evaluate its performance in a robot soccer case study. A set of metrics calculated from the positions of the players and the ball on the field serves as the primitive building blocks of an immediate reward function, which is defined as a weighted combination of these metrics. The weights are obtained using a GA, and the resulting reward function yields significantly better soccer-playing performance.
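As an illustration of the idea, the following minimal Python sketch treats the immediate reward as a weighted sum of metric values and searches for the weights with a simple GA. It is not the method evaluated in the case study: the function names, the GA operators (tournament selection, uniform crossover, Gaussian mutation, elitism), all parameter values, and the toy fitness function are assumptions chosen for illustration. In a real setting, the fitness of a weight vector would presumably be measured by letting the robots learn with the corresponding reward function and scoring the resulting play.

```python
import random


def immediate_reward(weights, metrics):
    """Immediate reward as a weighted combination of metric values."""
    return sum(w * m for w, m in zip(weights, metrics))


def evolve_weights(fitness, n_weights, pop_size=20, generations=30,
                   mutation_sigma=0.1, seed=0):
    """Search for a weight vector that maximizes `fitness` with a simple GA.

    The operators and defaults here are illustrative assumptions, not
    settings taken from the paper.
    """
    rng = random.Random(seed)
    population = [[rng.uniform(-1.0, 1.0) for _ in range(n_weights)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Elitism: carry the best individual over unchanged.
        best = max(population, key=fitness)
        next_pop = [best[:]]
        while len(next_pop) < pop_size:
            # Tournament selection of two parents (tournament size 3).
            a = max(rng.sample(population, 3), key=fitness)
            b = max(rng.sample(population, 3), key=fitness)
            # Uniform crossover followed by Gaussian mutation.
            child = [rng.choice(pair) for pair in zip(a, b)]
            child = [w + rng.gauss(0.0, mutation_sigma) for w in child]
            next_pop.append(child)
        population = next_pop
    return max(population, key=fitness)


# Toy stand-in for the expensive evaluation (hypothetical): fitness is
# higher the closer the weights are to an arbitrary hidden target vector.
TARGET = [0.5, -0.2, 0.8]


def toy_fitness(weights):
    return -sum((w - t) ** 2 for w, t in zip(weights, TARGET))


best_weights = evolve_weights(toy_fitness, n_weights=3)
# Reward for one hypothetical vector of metric values.
reward = immediate_reward(best_weights, [1.0, 2.0, 0.5])
```

Because the best individual is copied unchanged into each new generation, the best fitness in the population never decreases; the price is somewhat reduced diversity, which a larger population or mutation rate could offset.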
