Cooperation-eliciting prisoner's dilemma payoffs for reinforcement learning agents
This work considers stateless Q-learning agents in the iterated Prisoner's Dilemma (PD). In earlier work, we gave a condition on the PD payoffs and Q-learning parameters that helps stateless Q-learning agents cooperate with each other. That condition, however, rests on a restrictive premise. This work relaxes the premise and presents a new payoff condition for mutual cooperation. From the new condition, we then derive the payoff relations that elicit mutual cooperation.
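To make the setting concrete, the following is a minimal sketch of two stateless Q-learning agents playing the iterated PD. It is not the condition or method of this work: the payoff values (T=5, R=3, P=1, S=0), the epsilon-greedy action selection, the parameter defaults, and all names (`StatelessQAgent`, `play`) are illustrative assumptions. The update rule assumed here keeps one Q-value per action and uses the agent's own best Q-value as the bootstrap term.

```python
import random

# Assumed payoffs with T > R > P > S and 2R > T + S (illustrative only).
T, R, P, S = 5.0, 3.0, 1.0, 0.0
C, D = 0, 1  # action indices: cooperate, defect

# PAYOFF[(my_action, opponent_action)] -> my reward
PAYOFF = {(C, C): R, (C, D): S, (D, C): T, (D, D): P}


class StatelessQAgent:
    """Q-learner with no state: one Q-value per action (hypothetical sketch)."""

    def __init__(self, alpha=0.1, gamma=0.9, epsilon=0.1, rng=None):
        self.q = [0.0, 0.0]
        self.alpha = alpha      # learning rate
        self.gamma = gamma      # discount factor
        self.epsilon = epsilon  # exploration rate (assumed epsilon-greedy)
        self.rng = rng or random.Random()

    def act(self):
        # Explore with probability epsilon, otherwise pick the greedy action.
        if self.rng.random() < self.epsilon:
            return self.rng.choice((C, D))
        return C if self.q[C] >= self.q[D] else D

    def update(self, action, reward):
        # Assumed rule: Q(a) <- Q(a) + alpha * (r + gamma * max_b Q(b) - Q(a))
        target = reward + self.gamma * max(self.q)
        self.q[action] += self.alpha * (target - self.q[action])


def play(rounds=10000, seed=0, **agent_kwargs):
    """Run the iterated PD and return (mutual cooperation rate, Q-values of a, Q-values of b)."""
    rng = random.Random(seed)
    a = StatelessQAgent(rng=rng, **agent_kwargs)
    b = StatelessQAgent(rng=rng, **agent_kwargs)
    mutual_c = 0
    for _ in range(rounds):
        x, y = a.act(), b.act()
        a.update(x, PAYOFF[(x, y)])
        b.update(y, PAYOFF[(y, x)])
        mutual_c += (x == C and y == C)
    return mutual_c / rounds, a.q, b.q
```

Calling `play()` with different payoff values or parameters gives a rough empirical feel for how often mutual cooperation occurs; whether it emerges in a given run depends on those choices, which is the question the payoff condition in this work addresses analytically.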