Cyclic error correction based Q-learning for mobile robots navigation
[1] Csaba Szepesvári, et al. Algorithms for Reinforcement Learning, 2010, Synthesis Lectures on Artificial Intelligence and Machine Learning.
[2] Abhijit Gosavi, et al. Simulation-Based Optimization: Parametric Optimization Techniques and Reinforcement Learning, 2003.
[3] Jan Peters, et al. Reinforcement learning in robotics: A survey, 2013, Int. J. Robotics Res.
[4] Qingquan Li, et al. Reinforcement learning control for coordinated manipulation of multi-robots, 2015, Neurocomputing.
[5] Peter Dayan, et al. Technical Note: Q-Learning, 2004, Machine Learning.
[6] Wolfram Burgard, et al. Coordinated multi-robot exploration, 2005, IEEE Transactions on Robotics.
[7] Cha Zhang, et al. Ensemble Machine Learning, 2012.
[8] Sean Luke, et al. Cooperative Multi-Agent Learning: The State of the Art, 2005, Autonomous Agents and Multi-Agent Systems.
[9] Yang Liu, et al. A new Q-learning algorithm based on the metropolis criterion, 2004, IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics).
[10] Yishay Mansour, et al. Learning Rates for Q-learning, 2004, J. Mach. Learn. Res.
[11] Martin L. Puterman, et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming, 1994.
[12] Ana L. C. Bazzan, et al. The Wisdom of Crowds in Bioinformatics: What Can We Learn (and Gain) from Ensemble Predictions?, 2013, AAAI.
[13] Ying Wang, et al. A reinforcement learning based robotic navigation system, 2014, 2014 IEEE International Conference on Systems, Man, and Cybernetics (SMC).
[14] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[15] José del R. Millán, et al. Learning to Avoid Obstacles Through Reinforcement, 1991, ML.
[16] Patrick M. Pilarski, et al. Tuning-free step-size adaptation, 2012, 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[17] F.L. Lewis, et al. Reinforcement learning and adaptive dynamic programming for feedback control, 2009, IEEE Circuits and Systems Magazine.
[18] Günther Palm, et al. Value-Difference Based Exploration: Adaptive Control between Epsilon-Greedy and Softmax, 2011, KI.
[19] Michael L. Littman, et al. Reinforcement learning improves behaviour from evaluative feedback, 2015, Nature.
[20] Koichi Moriyama, et al. Learning-Rate Adjusting Q-Learning for Prisoner's Dilemma Games, 2008, 2008 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology.
[21] Lynne E. Parker, et al. A Reinforcement Learning Algorithm in Cooperative Multi-Robot Domains, 2005, J. Intell. Robotic Syst.