Applying reinforcement learning algorithms to Partially Observable Stochastic Games (POSGs) is challenging, since each agent has access to only partial state information and, in the case of concurrent learners, the environment has non-stationary dynamics. These problems could be partially overcome if the policies followed by the other agents were known; for this reason, many approaches try to estimate them through so-called opponent modeling techniques. Although much research has been devoted to studying the accuracy of such estimates of opponents' policies, little attention has been paid to understanding in which situations these estimated models are actually useful for improving the agent's performance. This paper presents a preliminary study of the impact of using opponent modeling techniques to learn the solution of a POSG. Our main purpose is to measure the performance gain that can be obtained by exploiting information about the other agents' policies, and how this gain is affected by the accuracy of the estimated models. Our analysis focuses on a small two-agent POSG, Kuhn poker, a simplified version of classical poker. Three cases are considered, according to the agent's knowledge of the opponent's policy: no knowledge, perfect knowledge, and imperfect knowledge. The aim is to identify the maximum error that can affect the model estimate without leading to performance lower than that reachable without opponent-modeling information. Finally, we show how the results of this analysis can be used to improve the performance of a reinforcement learning algorithm coupled with a simple opponent modeling technique.
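To make the kind of opponent modeling discussed above concrete, the following is a minimal Python sketch of a frequency-based opponent model, one of the simplest estimation techniques; all class, method, and variable names here are our own illustration and are not taken from the paper. The opponent's policy at each observable information set is estimated from empirical action counts, which makes the notion of model error tangible: the estimate approaches the true policy only as observations accumulate.

```python
from collections import defaultdict

class FrequencyOpponentModel:
    """Estimates the opponent's policy as empirical action frequencies
    per observable information set (an illustrative baseline technique,
    not necessarily the one used in the paper)."""

    def __init__(self, actions=("bet", "pass")):
        # In Kuhn poker each player chooses between betting and passing.
        self.actions = actions
        # counts[infoset][action] -> number of times the action was observed
        self.counts = defaultdict(lambda: {a: 0 for a in actions})

    def observe(self, infoset, action):
        """Record one opponent action taken at an observable information set."""
        self.counts[infoset][action] += 1

    def policy(self, infoset):
        """Return the estimated action distribution at the information set,
        falling back to uniform when the infoset has never been observed."""
        c = self.counts[infoset]
        total = sum(c.values())
        if total == 0:
            return {a: 1.0 / len(self.actions) for a in self.actions}
        return {a: c[a] / total for a in self.actions}


# Example: after watching the opponent bet twice and pass once at some
# information set (labeled "history:b" here purely for illustration),
# the model estimates P(bet) = 2/3 and P(pass) = 1/3 at that infoset.
model = FrequencyOpponentModel()
for a in ("bet", "bet", "pass"):
    model.observe("history:b", a)
print(model.policy("history:b"))  # -> {'bet': 0.666..., 'pass': 0.333...}
```

An agent can then plug these estimated probabilities into its action-value updates in place of the true (unknown) opponent policy; the paper's analysis addresses precisely how inaccurate such estimates can be before they hurt rather than help.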