On the rationality of profit sharing in multi-agent reinforcement learning
Reinforcement learning is a form of machine learning in which an agent adapts to an unknown environment on the basis of rewards. Most theoretical treatments of reinforcement learning assume that the environment is Markovian. In multi-agent reinforcement learning systems, however, it is important to handle non-Markovian environments. We use Profit Sharing (PS) as the reinforcement learning system and discuss its rationality in multi-agent environments. In particular, we classify non-Markovian environments and discuss how a reward should be shared among reinforcement learning agents. Using a crane control problem, we confirm the effectiveness of PS in multi-agent environments.
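The abstract does not give the update rule, so the following is only a minimal sketch of how episode-based credit assignment and reward sharing might look. It assumes a geometrically decreasing credit function, which is a common choice for Profit Sharing. The names `profit_sharing_update`, `share_reward`, `decay`, and `mu` are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def profit_sharing_update(weights, episode, reward, decay=0.25):
    """Reinforce every (state, action) rule fired during one episode.

    Assumption: the rule fired last receives `reward`, and each earlier
    rule receives `decay` times the credit of the rule that followed it.
    """
    credit = reward
    for state, action in reversed(episode):
        weights[(state, action)] += credit
        credit *= decay
    return weights


def share_reward(reward, n_agents, rewarded_agent, mu=0.1):
    """Split a reward among agents.

    Assumption: the agent that earned the reward keeps all of it, and
    every other agent receives the fraction `mu` of it.
    """
    return [reward if i == rewarded_agent else mu * reward
            for i in range(n_agents)]


# One agent's episode: the rules it fired, in order, before the reward.
weights = defaultdict(float)
episode = [("s0", "left"), ("s1", "up"), ("s2", "grab")]
profit_sharing_update(weights, episode, reward=1.0)
```

Under these assumptions, rules closer to the reward are reinforced more strongly, and each agent would pass its own entry from `share_reward` as the `reward` argument when updating its own weights.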