LMRL: a multi-agent reinforcement learning model and algorithm

Multi-agent reinforcement learning has mainly been investigated from two perspectives: concurrency and game theory. The former chiefly applies to cooperative multi-agent systems, while the latter usually applies to coordinated multi-agent systems. However, agents built on either approach still face problems such as credit assignment and multiple Nash equilibria. In this paper, we propose LMRL, a new multi-agent reinforcement learning model and algorithm designed from a layered perspective. The LMRL model is composed of two layers: an offline training layer, which employs single-agent reinforcement learning to acquire stationary strategy knowledge, and an online interaction layer, which employs multi-agent reinforcement learning together with that strategy knowledge, revised dynamically, to interact with the environment. An agent equipped with LMRL can improve its generalization, adaptability, and coordination ability. Experiments show that LMRL can outperform both single-agent reinforcement learning and Nash-Q.
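The two-layer idea can be illustrated with a minimal sketch. It is not the LMRL algorithm itself, which the abstract does not specify. It assumes a hypothetical five-cell corridor task, tabular Q-learning in the offline layer, and an independent learner in the online layer whose table is indexed by joint position. All names (`train_offline`, `OnlineAgent`, `move`) and all parameter values are illustrative assumptions.

```python
import random

N_STATES, GOAL, ACTIONS = 5, 4, (0, 1)  # assumed corridor; 0 = left, 1 = right

def move(pos, action):
    """Deterministic corridor dynamics, clipped at both ends."""
    return max(0, pos - 1) if action == 0 else min(N_STATES - 1, pos + 1)

def greedy(q_row, rng):
    """Greedy action with random tie-breaking."""
    return max(ACTIONS, key=lambda a: (q_row[a], rng.random()))

def train_offline(episodes=300, alpha=0.5, gamma=0.9, eps=0.2, seed=0):
    """Offline layer (assumed): single-agent tabular Q-learning, no other agents."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        s = 0
        for _ in range(50):  # step limit per episode
            a = rng.choice(ACTIONS) if rng.random() < eps else greedy(q[s], rng)
            s2 = move(s, a)
            r = 1.0 if s2 == GOAL else 0.0
            # The goal row is never updated, so its bootstrap value stays 0.
            q[s][a] += alpha * (r + gamma * max(q[s2]) - q[s][a])
            s = s2
            if s == GOAL:
                break
    return q

class OnlineAgent:
    """Online layer (assumed): starts from offline knowledge, revises it per joint state."""

    def __init__(self, offline_q, alpha=0.5, gamma=0.9):
        self.offline_q, self.alpha, self.gamma = offline_q, alpha, gamma
        self.q = {}  # (own_pos, other_pos) -> [q_left, q_right]

    def values(self, own, other):
        # Unseen joint states fall back to a copy of the offline knowledge.
        return self.q.setdefault((own, other), list(self.offline_q[own]))

    def update(self, own, other, action, reward, own2, other2):
        """One Q-learning step on the joint-state table."""
        row = self.values(own, other)
        target = reward + self.gamma * max(self.values(own2, other2))
        row[action] += self.alpha * (target - row[action])

offline_q = train_offline()
agent = OnlineAgent(offline_q)
# A collision penalty observed online (the other agent blocks cell 2) revises
# the value of moving right only in that joint state.
agent.update(own=1, other=2, action=1, reward=-1.0, own2=1, other2=2)
```

In this sketch the offline table is never modified. The online layer copies a row the first time it meets a joint state and then revises only that copy, so knowledge learned in isolation remains available as a default for situations the agent has not yet encountered with other agents present.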