Paper information - Construction of a learning agent handling its rewards according to environmental situations

The authors aim to construct an agent that learns appropriate actions in a multi-agent environment both with and without social dilemmas. The agent should voluntarily give up its profit in a dilemma situation and keep its profit in other situations. The authors divide the environment into three situations and introduce reward-handling manners for learning actions that are effective in the respective situations. Since the agent must select the manner that is effective for the current situation, the authors devise criteria for recognizing the situation. The paper shows that an agent equipped with these manners and criteria acts well in two of the three multi-agent situations composed of homogeneous agents.

Masayuki Numao | Koichi Moriyama

[1] Masayuki Numao, et al. Constructing an Autonomous Agent with an Interdependent Heuristics, 2000, PRICAI.

[2] S. Mikami. Cooperative reinforcement learning by Payoff filters, 1995.

[3] Xin Yao, et al. An Experimental Study of N-Person Iterated Prisoner's Dilemma Games, 1993, Informatica.

[4] Peter Dayan, et al. Q-learning, 1992, Machine Learning.

[5] G. Hardin, et al. The Tragedy of the Commons, 1968, Green Planet Blues.

[6] Peter Dayan, et al. Technical Note: Q-Learning, 2004, Machine Learning.