Situation-Dependent Utility in Extended Behavior Networks

In this paper, we present a modification of extended behavior networks that enables an agent to learn the relationship between world states and the actions it takes. To this end, we introduce a situation-dependent utility value based on the effects observed after executing an action. These utility values serve as the supporting points of multi-dimensional interpolation functions and allow the revised and extended action-selection mechanism to choose better actions over time. We assess the performance of our system in the simulated RoboCup domain; the evaluation shows that our approach improves action selection.
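
To make the core idea concrete, the following minimal Python sketch illustrates one plausible reading of the mechanism described above: utility samples gathered after action execution are stored per action and situation, and a multi-dimensional (here, inverse-distance-weighted) interpolation over those samples estimates the utility of each action in a new situation. All names (`SituationUtility`, `select_action`) and the specific interpolation scheme are hypothetical illustrations, not the paper's actual formulation inside extended behavior networks.

```python
import math

class SituationUtility:
    """Stores (situation, utility) samples per action and interpolates
    a utility estimate for unseen situations (illustrative sketch only)."""

    def __init__(self):
        self.samples = {}  # action -> list of (situation_vector, utility)

    def update(self, action, situation, effect_utility):
        # After executing `action`, record the utility of the observed
        # effect as a new supporting point for this situation.
        self.samples.setdefault(action, []).append((situation, effect_utility))

    def estimate(self, action, situation):
        # Inverse-distance-weighted interpolation over stored samples.
        points = self.samples.get(action, [])
        if not points:
            return 0.0  # no experience yet: neutral utility
        num, den = 0.0, 0.0
        for s, u in points:
            d = math.dist(situation, s)
            if d == 0.0:
                return u  # exact match: return the stored utility
            w = 1.0 / d ** 2
            num += w * u
            den += w
        return num / den

def select_action(actions, situation, utilities):
    # Choose the action with the highest interpolated utility
    # in the current situation.
    return max(actions, key=lambda a: utilities.estimate(a, situation))
```

In this reading, action selection improves over time simply because each executed action adds another supporting point, so the interpolated utility surface increasingly reflects which actions pay off in which regions of the situation space.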