Reward and Diversity in Multirobot Foraging

This research seeks to quantify the impact of the choice of reward function on behavioral diversity in learning robot teams. The methodology developed for this work has been applied to multirobot foraging, soccer, and cooperative movement. This paper focuses specifically on results in multirobot foraging. In these experiments, three types of reward are used with Q-learning to train a multirobot team to forage: a local performance-based reward, a global performance-based reward, and a heuristic strategy referred to as shaped reinforcement. Local strategies provide each agent a specific reward according to its own behavior, while global rewards provide all the agents on the team the same reward simultaneously. Shaped reinforcement provides a heuristic reward for an agent's action given its situation. The experiments indicate that local performance-based rewards and shaped reinforcement generate statistically similar results: they both provide the best performance and the least diversity. Finally, learned policies are demonstrated on a team of Nomadic Technologies Nomad robots.
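
The distinction between local and global reward can be illustrated with a minimal sketch. It is not the implementation used in the experiments. The function names, the reward values of 1.0 and 0.0, the "delivery" event, and the learning parameters alpha and gamma are all assumptions made for illustration. The update shown is the standard one-step tabular Q-learning rule. Shaped reinforcement is left out because the heuristic is not specified in this section.

```python
from collections import defaultdict


def local_rewards(deliveries):
    """Local reward (assumed values): each agent is rewarded only for its own delivery."""
    return [1.0 if delivered else 0.0 for delivered in deliveries]


def global_rewards(deliveries):
    """Global reward (assumed values): every agent gets the same reward if any teammate delivered."""
    team_reward = 1.0 if any(deliveries) else 0.0
    return [team_reward] * len(deliveries)


def q_update(q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    """Standard one-step tabular Q-learning update; alpha and gamma are illustrative."""
    best_next = max(q[(next_state, a)] for a in actions)
    q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
    return q[(state, action)]


# Hypothetical time step: agent 0 delivers an object, agents 1 and 2 do not.
deliveries = [True, False, False]
local = local_rewards(deliveries)
shared = global_rewards(deliveries)

# Each agent would keep its own Q-table; a single table is shown here.
q_table = defaultdict(float)
new_value = q_update(q_table, "carrying", "deliver", local[0], "searching",
                     ["deliver", "wander"])
```

Under the local scheme only the delivering agent is reinforced, whereas under the global scheme all three agents receive the same signal regardless of which one made the delivery.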