Best-Response Learning of Team Behaviour in Quake III

This paper proposes a mechanism for learning a best-response strategy to improve opponent intelligence in team-oriented commercial computer games. The mechanism, called TEAM2, is an extension of the TEAM mechanism for team-oriented adaptive behaviour explored in [Bakkes et al., 2004] and focuses on the exploitation of relevant gameplay experience. We compare the performance of the TEAM2 mechanism with that of the original TEAM mechanism in simulation studies. The results show that the TEAM2 mechanism is better able to learn team behaviour. We argue that its application as an online learning mechanism is hampered by occasional very long learning times, caused by an improper balance between exploitation and exploration. We conclude that TEAM2 improves opponent behaviour in team-oriented games and that, for online learning, the balance between exploitation and exploration is of primary importance.