Paper information - A Survey on Multiagent Reinforcement Learning Towards Multi-Robot Systems

Multiagent reinforcement learning for multi-robot systems is a challenging problem in both robotics and artificial intelligence. With ever-increasing interest in theoretical research and practical applications, considerable effort has gone into solving it. However, many difficulties remain in scaling multiagent reinforcement learning up to multi-robot systems. The main objective of this paper is to survey multiagent reinforcement learning in multi-robot systems, based on the literature the authors collected. After reviewing important advances in the field, the paper analyzes several challenging problems and closes with concluding remarks from the authors' perspective.

Erfu Yang | Dongbing Gu

[1] Maja J. Mataric,et al. Reinforcement Learning in the Multi-Robot Domain , 1997, Auton. Robots.

[2] S. van der Zwaan,et al. Cooperative learning and planning for multiple robots , 2000, Proceedings of the 2000 IEEE International Symposium on Intelligent Control. Held jointly with the 8th IEEE Mediterranean Conference on Control and Automation (Cat. No.00CH37147).

[3] Akira Hayashi,et al. A multiagent reinforcement learning algorithm using extended optimal response , 2002, AAMAS '02.

[4] Jiming Liu,et al. Multi-agent robotic systems , 2001 .

[5] Manuela M. Veloso,et al. Multiagent learning using a variable learning rate , 2002, Artif. Intell..

[6] Jürgen Schmidhuber,et al. Reinforcement Learning Soccer Teams with Incomplete World Models , 1999, Auton. Robots.

[7] Tucker R. Balch,et al. Behavior-based formation control for multirobot teams , 1998, IEEE Trans. Robotics Autom..

[8] Csaba Szepesvári,et al. A Generalized Reinforcement-Learning Model: Convergence and Applications , 1996, ICML.

[9] T. Başar,et al. Dynamic Noncooperative Game Theory , 1982 .

[10] D. Vengerov,et al. An Empirical Model of Factor Adjustment Dynamics , 2006 .

[11] Huosheng Hu,et al. KaBaGe-RL: Kanerva-based generalisation and reinforcement learning for possession football , 2001, Proceedings 2001 IEEE/RSJ International Conference on Intelligent Robots and Systems. Expanding the Societal Role of Robotics in the Next Millennium (Cat. No.01CH37180).

[12] D. Fudenberg,et al. The Theory of Learning in Games , 1998 .

[13] Ahmet Arslan,et al. Minimax Fuzzy Q-Learning in Cooperative Multi-agent Systems , 2002, ADVIS.

[14] Majid Nili Ahmadabadi,et al. Expertness based cooperative Q-learning , 2002, IEEE Trans. Syst. Man Cybern. Part B.

[15] Michail G. Lagoudakis,et al. Value Function Approximation in Zero-Sum Markov Games , 2002, UAI.

[16] Michael P. Wellman,et al. Learning about other agents in a dynamic multiagent system , 2001, Cognitive Systems Research.

[17] Michael L. Littman,et al. Friend-or-Foe Q-learning in General-Sum Games , 2001, ICML.

[18] Vladislav Tadic,et al. On the Convergence of Temporal-Difference Learning with Linear Function Approximation , 2001, Machine Learning.

[19] Reda Alhajj,et al. Multiagent reinforcement learning using function approximation , 2000, IEEE Trans. Syst. Man Cybern. Part C.

[20] Yoav Shoham,et al. Multi-Agent Reinforcement Learning: a critical survey , 2003 .

[21] Csaba Szepesvári,et al. Generalized Markov Decision Processes: Dynamic-programming and Reinforcement-learning Algorithms , 1996 .

[22] Craig Boutilier,et al. Sequential Optimality and Coordination in Multiagent Systems , 1999, IJCAI.

[23] Manuela M. Veloso,et al. Multiagent Systems: A Survey from a Machine Learning Perspective , 2000, Auton. Robots.

[24] Michael P. Wellman,et al. Nash Q-Learning for General-Sum Stochastic Games , 2003, J. Mach. Learn. Res..

[25] Michael P. Wellman,et al. Multiagent Reinforcement Learning: Theoretical Framework and an Algorithm , 1998, ICML.

[26] Minoru Asada,et al. Cooperative Behavior Acquisition for Mobile Robots in Dynamically Changing Real Worlds Via Vision-Based Reinforcement Learning and Development , 1999, Artif. Intell..

[27] Ron Sun,et al. Rationality Assumptions and Optimality of Co-learning , 2000, PRIMA.

[28] H.R. Berenji,et al. Cooperation and coordination between fuzzy reinforcement learning agents in continuous state partially observable Markov decision processes , 1999, FUZZ-IEEE'99. 1999 IEEE International Fuzzy Systems. Conference Proceedings (Cat. No.99CH36315).

[29] Michael L. Littman,et al. Value-function reinforcement learning in Markov games , 2001, Cognitive Systems Research.

[30] Bikramjit Banerjee,et al. Convergent Gradient Ascent in General-Sum Games , 2002, ECML.

[31] Eduardo F. Morales,et al. Scaling Up Reinforcement Learning with a Relational Representation , 2003 .

[32] Sandip Sen,et al. Multiagent Coordination with Learning Classifier Systems , 1995, Adaption and Learning in Multi-Agent Systems.

[33] Michael L. Littman,et al. Markov Games as a Framework for Multi-Agent Reinforcement Learning , 1994, ICML.

[34] Jong-Hwan Kim,et al. Modular Q-learning based multi-agent cooperation for robot soccer , 2001, Robotics Auton. Syst..

[35] Andrew B. Kahng,et al. Cooperative Mobile Robotics: Antecedents and Directions , 1997, Auton. Robots.

[36] Rémi Munos,et al. A Study of Reinforcement Learning in the Continuous Case by the Means of Viscosity Solutions , 2000, Machine Learning.

[37] Fredrik A. Dahl,et al. The Lagging Anchor Algorithm: Reinforcement Learning in Two-Player Zero-Sum Games with Imperfect Information , 2002, Machine Learning.

[38] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.

[39] Michael P. Wellman,et al. Multiagent Reinforcement Learning in Stochastic Games , 1999, ICML 1999.

[40] Imad H. Elhajj,et al. Design and Analysis of Internet-Based Tele-Coordinated Multi-Robot Systems , 2003, Auton. Robots.

[41] Manuela Veloso,et al. Multiagent learning in the presence of agents with limitations , 2003 .

[42] Hamid R. Berenji,et al. A convergent actor-critic-based FRL algorithm with application to power management of wireless transmitters , 2003, IEEE Trans. Fuzzy Syst..

[43] Daniele Nardi,et al. Distributed Coordination in Heterogeneous Multi-Robot Systems , 2003, Auton. Robots.

[44] François Michaud,et al. Learning from History for Behavior-Based Mobile Robots in Non-Stationary Conditions , 1998, Machine Learning.

[45] Claude F. Touzet,et al. Distributed Lazy Q-Learning for Cooperative Mobile Robots , 2004 .

[46] Georgios Chalkiadakis. Multiagent reinforcement learning: stochastic games with multiple learning players , 2003 .

[47] L. E. Parker. Learning in Large Cooperative Multi-Robot Domains , 2001 .

[48] Gaurav S. Sukhatme,et al. Multi-Robot Task Allocation in Uncertain Environments , 2003, Auton. Robots.

[49] Claude F. Touzet,et al. Robot Awareness in Cooperative Mobile Robot Learning , 2000, Auton. Robots.

[50] Yishay Mansour,et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation , 1999, NIPS.

[51] Peter Stone,et al. Leading Best-Response Strategies in Repeated Games , 2001, International Joint Conference on Artificial Intelligence.

[52] Bikramjit Banerjee,et al. Adaptive policy gradient in multiagent learning , 2003, AAMAS '03.

[53] Csaba Szepesvári,et al. A Unified Analysis of Value-Function-Based Reinforcement-Learning Algorithms , 1999, Neural Computation.

[54] Craig Boutilier,et al. The Dynamics of Reinforcement Learning in Cooperative Multiagent Systems , 1998, AAAI/IAAI.

[55] Peter Dayan,et al. Q-learning , 1992, Machine Learning.

[56] Maja J. Mataric,et al. Learning in behavior-based multi-robot systems: policies, models, and other agents , 2001, Cognitive Systems Research.

[57] Manuela M. Veloso,et al. Simultaneous Adversarial Multi-Robot Learning , 2003, IJCAI.