Context-sensitive reward shaping for sparse interaction MAS (abstract)

This paper describes the use of context-aware potential functions to guide agents towards desired solutions in multi-agent systems with sparse interactions. In such systems, interactions between agents occur only sporadically, in certain regions of the state space that are unknown to the agents a priori. During these interactions, the agents must coordinate in order to reach the globally optimal solution. We demonstrate how different reward shaping functions can be used on top of Future Coordinating Q-learning (FCQ-learning), an algorithm capable of automatically detecting when agents should take each other into consideration. Using FCQ-learning, coordination problems can be anticipated before they actually occur, allowing them to be solved in a timely manner. We evaluate our approach on a range of gridworld problems, as well as a simulation of Air Traffic Control.