Cost-Based Policy Mapping for Imitation

Imitation is a powerful approach to programming and autonomous learning in robotic and computer systems. An important aspect of imitation is the mapping of observations to an executable control strategy. This mapping is particularly important when the behavioral capabilities of the observed and the imitating agent differ significantly. This paper presents an approach that addresses this problem by locally optimizing a cost function that combines the deviation from the observed state sequence with the cost of the actions required to perform the imitation. The result is a set of imitation strategies that the imitating agent can execute and that resemble the observations of the demonstrating agent as closely as possible. The performance of the approach is illustrated in a simulated multi-agent environment.
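
One way to make the described cost trade-off concrete is a per-step objective that penalizes both the deviation from the demonstrated state sequence and the cost of the imitator's own action; the notation below is an illustrative sketch and not the paper's formulation:

\[
J_t(a) \;=\; \big\| f(s_t, a) - \hat{s}_{t+1} \big\|^2 \;+\; \lambda\, c(s_t, a),
\qquad
a_t^{*} \;=\; \arg\min_{a \in \mathcal{A}} J_t(a),
\]

where, under these assumed symbols, \(f\) is the imitating agent's transition model, \(\hat{s}_{t+1}\) the next observed state of the demonstrator, \(c\) the imitator's action cost, \(\mathcal{A}\) its (possibly different) action set, and \(\lambda\) a trade-off weight. Locally minimizing such an objective at each step would yield a policy of the kind the abstract describes: executable by the imitator while tracking the demonstration as closely as possible.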