A survey of reinforcement learning research and its application for multi-robot systems
Xia Yang | Yang Yuequan | Jin Lu | Cao Zhiqiang | Tang Hongru | Ni Chunbo
[1] Richard S. Sutton,et al. Integrated Architectures for Learning, Planning, and Reacting Based on Approximating Dynamic Programming , 1990, ML.
[2] R. Arkin,et al. Behavioral diversity in learning robot teams , 1998 .
[3] Manuela Veloso,et al. Tree based hierarchical reinforcement learning , 2002 .
[4] Benjamin Kuipers,et al. Qualitative and Quantitative Simulation: Bridging the Gap , 1997, Artif. Intell.
[5] Zonghai Chen,et al. Grey Reinforcement Learning for Incomplete Information Processing , 2006, TAMC.
[6] Richard S. Sutton,et al. Learning to predict by the methods of temporal differences , 1988, Machine Learning.
[7] Ronald E. Parr,et al. Hierarchical control and learning for Markov decision processes , 1998 .
[8] Ronald A. Howard,et al. Dynamic Probabilistic Systems , 1971 .
[9] Chris Watkins,et al. Learning from delayed rewards , 1989 .
[10] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[11] Sumit Mukhopadhyay,et al. A Behavior-based Approach for Multi-agent Q-learning for Autonomous Exploration , 2011, ArXiv.
[12] Sandip Sen,et al. Learning to Coordinate without Sharing Information , 1994, AAAI.
[13] Daoyi Dong,et al. Hybrid Control for Robot Navigation - A Hierarchical Q-Learning Algorithm , 2008, IEEE Robotics & Automation Magazine.
[14] Andrew W. Moore,et al. Reinforcement Learning: A Survey , 1996, J. Artif. Intell. Res..
[15] Bernhard Hengst,et al. Discovering hierarchy in reinforcement learning , 2003 .
[16] W. Lovejoy. A survey of algorithmic methods for partially observed Markov decision processes , 1991 .
[17] Mengchun Xie. Representation of the perceived environment and acquisition of behavior rule for multi-agent systems by Q-learning , 2009, 2009 4th International Conference on Autonomous Robots and Agents.
[18] Bir Bhanu,et al. Real-time robot learning , 2001, Proceedings 2001 ICRA. IEEE International Conference on Robotics and Automation (Cat. No.01CH37164).
[19] K. Fu,et al. A heuristic approach to reinforcement learning control systems , 1965 .
[20] Pradeep K. Khosla,et al. The necessity of average rewards in cooperative multirobot learning , 2002, Proceedings 2002 IEEE International Conference on Robotics and Automation (Cat. No.02CH37292).
[21] Richard S. Sutton,et al. Neuronlike adaptive elements that can solve difficult learning control problems , 1983, IEEE Transactions on Systems, Man, and Cybernetics.
[22] Margaret Mary Skelly,et al. Hierarchical Reinforcement Learning with Function Approximation for Adaptive Control , 2004 .
[23] Yunyi Jia,et al. Coordinated formation control for multi-robot systems with communication constraints , 2011, 2011 IEEE/ASME International Conference on Advanced Intelligent Mechatronics (AIM).
[24] Peter Dayan,et al. Technical Note: Q-Learning , 1992, Machine Learning.
[25] Mahesan Niranjan,et al. On-line Q-learning using connectionist systems , 1994 .