Paper information - A novel multi-agent Q-learning algorithm in cooperative multi-agent system

Q-learning can provide a robust and natural means for agents to learn how to coordinate their action choices in multi-agent systems. We examine some of the factors that can influence the dynamics of the learning process in such a setting. We first distinguish reinforcement learners that are unaware of (or ignore) the presence of other agents from those that explicitly attempt to learn the value of joint actions and the strategies of their counterparts. We study Q-learning in cooperative multi-agent systems under these two perspectives, focusing on convergence to a Nash equilibrium. We propose an exploration strategy to increase the likelihood of convergence to an optimal equilibrium.
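The distinction the abstract draws between the two kinds of learner can be illustrated with a minimal sketch. It assumes a stateless, repeated two-agent game with a shared reward. The payoff matrix, the class names, the learning rate, and the decaying epsilon-greedy rule are all hypothetical choices made for illustration. In particular, epsilon-greedy is only a stand-in and is not the exploration strategy the paper proposes, which the abstract does not specify.

```python
import random

# Hypothetical shared payoff for a 2x2 coordination game: both agents
# receive PAYOFF[a][b] when agent 0 plays a and agent 1 plays b.
PAYOFF = [[10.0, 0.0],
          [0.0, 5.0]]
ACTIONS = [0, 1]


class IndependentLearner:
    """Ignores the other agent: keeps one Q-value per own action."""

    def __init__(self, alpha=0.1):
        self.alpha = alpha
        self.q = [0.0 for _ in ACTIONS]

    def values(self):
        return list(self.q)

    def update(self, own, other, reward):
        # 'other' is deliberately unused by this learner.
        self.q[own] += self.alpha * (reward - self.q[own])


class JointActionLearner:
    """Keeps Q-values over joint actions and counts the other agent's play."""

    def __init__(self, alpha=0.1):
        self.alpha = alpha
        self.q = [[0.0 for _ in ACTIONS] for _ in ACTIONS]
        self.counts = [1 for _ in ACTIONS]  # uniform prior (an assumption)

    def values(self):
        # Expected value of each own action under the empirical
        # distribution of the other agent's past actions.
        total = sum(self.counts)
        return [sum(self.counts[b] / total * self.q[a][b] for b in ACTIONS)
                for a in ACTIONS]

    def update(self, own, other, reward):
        self.counts[other] += 1
        self.q[own][other] += self.alpha * (reward - self.q[own][other])


def choose(values, epsilon, rng):
    """Epsilon-greedy choice with random tie-breaking (placeholder rule)."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    best = max(values)
    return rng.choice([a for a in ACTIONS if values[a] == best])


def run(agent_cls, episodes=2000, seed=0):
    """Let two learners of the same kind play the repeated game."""
    rng = random.Random(seed)
    agents = [agent_cls(), agent_cls()]
    for t in range(episodes):
        # Exploration decays linearly over the first half of the run.
        epsilon = max(0.01, 1.0 - t / (0.5 * episodes))
        a = choose(agents[0].values(), epsilon, rng)
        b = choose(agents[1].values(), epsilon, rng)
        reward = PAYOFF[a][b]
        agents[0].update(a, b, reward)
        agents[1].update(b, a, reward)
    return agents
```

Calling `run(IndependentLearner)` and `run(JointActionLearner)` and comparing `values()` for each agent shows what each learner ends up estimating. The independent learner's estimate for an action averages over whatever the other agent happened to play, while the joint-action learner keeps those cases separate and combines them through its model of the other agent.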
