Javier González | Angel Martínez-Tenor | Ana Cruz-Martín | Juan-Antonio Fernández-Madrigal
[1] Günther Palm,et al. Value-Difference Based Exploration: Adaptive Control between Epsilon-Greedy and Softmax , 2011, KI.
[2] Jan Peters,et al. Reinforcement learning in robotics: A survey , 2013, Int. J. Robotics Res..
[3] Patrick M. Pilarski,et al. True Online Temporal-Difference Learning , 2015, J. Mach. Learn. Res..
[4] Anton Schwartz,et al. A Reinforcement Learning Method for Maximizing Undiscounted Rewards , 1993, ICML.
[5] Long-Ji Lin,et al. Reinforcement learning for robots using neural networks , 1992 .
[6] Gary G. Yen,et al. Reinforcement learning algorithms for robotic navigation in dynamic environments. , 2004, ISA transactions.
[7] Peter Stone,et al. Transfer Learning for Reinforcement Learning Domains: A Survey , 2009, J. Mach. Learn. Res..
[8] Stefan Wermter,et al. Real-world reinforcement learning for autonomous humanoid robot docking , 2012, Robotics Auton. Syst..
[9] Patrick M. Pilarski,et al. Model-Free reinforcement learning with continuous action in practice , 2012, 2012 American Control Conference (ACC).
[10] Jan Peters,et al. Learning Motor Skills - From Algorithms to Robot Experiments , 2013, Springer Tracts in Advanced Robotics.
[11] Handy Wicaksono. Q learning behavior on autonomous navigation of physical robot , 2011, 2011 8th International Conference on Ubiquitous Robots and Ambient Intelligence (URAI).
[12] Sidney Nascimento Givigi,et al. Multiple Model Q-Learning for Stochastic Asynchronous Rewards , 2016, J. Intell. Robotic Syst..
[13] Marco Wiering,et al. Reinforcement Learning , 2014, Adaptation, Learning, and Optimization.
[14] Sebastian Thrun,et al. Efficient Exploration In Reinforcement Learning , 1992 .
[15] Alex Graves,et al. Playing Atari with Deep Reinforcement Learning , 2013, ArXiv.
[16] Maya Cakmak,et al. Towards grounding concepts for transfer in goal learning from demonstration , 2011, 2011 IEEE International Conference on Development and Learning (ICDL).
[17] Athanasios S. Polydoros,et al. Survey of Model-Based Reinforcement Learning: Applications on Robotics , 2017, J. Intell. Robotic Syst..
[18] Chris Watkins,et al. Learning from delayed rewards , 1989 .
[19] Gheorghe Mogan,et al. Neural networks based reinforcement learning for mobile robots obstacle avoidance , 2016, Expert Syst. Appl..
[20] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[21] Sergey Levine,et al. Incentivizing Exploration In Reinforcement Learning With Deep Predictive Models , 2015, ArXiv.
[22] Javier García,et al. A comprehensive survey on safe reinforcement learning , 2015, J. Mach. Learn. Res..
[23] Mahesan Niranjan,et al. On-line Q-learning using connectionist systems , 1994 .
[24] Carlos V. Regueiro,et al. Learning on real robots from experience and simple user feedback , 2013 .
[25] R. Bellman. Dynamic programming. , 1957, Science.
[26] Hado van Hasselt,et al. Reinforcement Learning in Continuous State and Action Spaces , 2012, Reinforcement Learning.
[27] Carl E. Rasmussen,et al. Gaussian Processes for Data-Efficient Learning in Robotics and Control , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[28] Morgan Quigley,et al. ROS: an open-source Robot Operating System , 2009, ICRA 2009.
[29] Surya P. N. Singh,et al. V-REP: A versatile and scalable robot simulation framework , 2013, 2013 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[30] J. Tukey. Comparing individual means in the analysis of variance. , 1949, Biometrics.
[31] Alborz Geramifard,et al. RLPy: a value-function-based reinforcement learning framework for education and research , 2015, J. Mach. Learn. Res..
[32] Alborz Geramifard,et al. Dyna-Style Planning with Linear Function Approximation and Prioritized Sweeping , 2008, UAI.
[33] Wojciech Zaremba,et al. OpenAI Gym , 2016, ArXiv.
[34] Richard S. Sutton,et al. Reinforcement learning with replacing eligibility traces , 2004, Machine Learning.
[35] Richard S. Sutton,et al. True Online TD(lambda) , 2014, ICML.
[36] Sridhar Mahadevan,et al. Recent Advances in Hierarchical Reinforcement Learning , 2003, Discret. Event Dyn. Syst..
[37] Chris Gaskett,et al. Q-Learning for Robot Control , 2002 .
[38] A. M. Turing,et al. Computing Machinery and Intelligence , 1950, The Philosophy of Artificial Intelligence.
[39] Alex Graves,et al. Asynchronous Methods for Deep Reinforcement Learning , 2016, ICML.
[40] Marco Wiering,et al. Explorations in efficient reinforcement learning , 1999 .
[41] Richard S. Sutton,et al. Neuronlike adaptive elements that can solve difficult learning control problems , 1983, IEEE Transactions on Systems, Man, and Cybernetics.
[42] Peter Vrancx,et al. Reinforcement Learning: State-of-the-Art , 2012 .
[43] Yuval Tassa,et al. Continuous control with deep reinforcement learning , 2015, ICLR.
[44] Reinaldo A. C. Bianchi,et al. Transferring knowledge as heuristics in reinforcement learning: A case-based approach , 2015, Artif. Intell..
[45] Tom Schaul,et al. Prioritized Experience Replay , 2015, ICLR.
[46] Darwin G. Caldwell,et al. Reinforcement Learning in Robotics: Applications and Real-World Challenges , 2013, Robotics.
[47] Peter Stone,et al. RTMBA: A Real-Time Model-Based Reinforcement Learning Architecture for robot control , 2011, 2012 IEEE International Conference on Robotics and Automation.
[48] Peter I. Corke,et al. Towards Vision-Based Deep Reinforcement Learning for Robotic Motion Control , 2015, ICRA 2015.
[49] Benjamin Van Roy,et al. Deep Exploration via Bootstrapped DQN , 2016, NIPS.
[50] M. Botvinick. Hierarchical reinforcement learning and decision making , 2012, Current Opinion in Neurobiology.
[51] Olivier Sigaud,et al. Towards Deep Developmental Learning , 2016, IEEE Transactions on Cognitive and Developmental Systems.
[52] Andrew W. Moore,et al. Prioritized Sweeping: Reinforcement Learning with Less Data and Less Time , 1993, Machine Learning.
[53] Stefan Schaal,et al. Reinforcement Learning for Humanoid Robotics , 2003 .
[54] Leslie Pack Kaelbling,et al. Effective reinforcement learning for mobile robots , 2002, Proceedings 2002 IEEE International Conference on Robotics and Automation (Cat. No.02CH37292).
[55] R. Bellman,et al. Dynamic Programming and Lagrange Multipliers , 1956, Proceedings of the National Academy of Sciences of the United States of America.
[56] Tomás Svoboda,et al. Safe Exploration Techniques for Reinforcement Learning - An Overview , 2014, MESAS.
[57] Peter Stone,et al. TEXPLORE: real-time sample-efficient reinforcement learning for robots , 2012, Machine Learning.
[58] A. H. Klopf,et al. Brain Function and Adaptive Systems: A Heterostatic Theory , 1972 .
[59] Pierre-Yves Oudeyer,et al. Exploration in Model-based Reinforcement Learning by Empirically Estimating Learning Progress , 2012, NIPS.
[60] Andrew W. Moore,et al. Reinforcement Learning: A Survey , 1996, J. Artif. Intell. Res..
[61] Pieter Abbeel,et al. Autonomous Helicopter Aerobatics through Apprenticeship Learning , 2010, Int. J. Robotics Res..
[62] Brian Tanner,et al. RL-Glue: Language-Independent Software for Reinforcement-Learning Experiments , 2009, J. Mach. Learn. Res..
[63] Giulio Sandini,et al. Developmental robotics: a survey , 2003, Connect. Sci..
[64] Ben Tse,et al. Autonomous Inverted Helicopter Flight via Reinforcement Learning , 2004, ISER.
[65] Manuel Lopes,et al. Learning exploration strategies in model-based reinforcement learning , 2013, AAMAS.
[66] Peter Stone,et al. Transfer learning for reinforcement learning on a physical robot , 2010, AAMAS 2010.
[67] Csaba Szepesvári,et al. Algorithms for Reinforcement Learning , 2010, Synthesis Lectures on Artificial Intelligence and Machine Learning.
[68] Martín Abadi,et al. TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems , 2016, ArXiv.