Online model-learning algorithm from samples and trajectories
[1] Bart De Schutter, et al. Reinforcement Learning and Dynamic Programming Using Function Approximators, 2010.
[2] Andrew W. Moore, et al. Prioritized Sweeping: Reinforcement Learning with Less Data and Less Time, 1993, Machine Learning.
[3] Fei Hu, et al. Intelligent Spectrum Management Based on Transfer Actor-Critic Learning for Rateless Transmissions in Cognitive Radio Networks, 2018, IEEE Transactions on Mobile Computing.
[4] Saeed Khodaygan, et al. Optimal path-planning for mobile robots to find a hidden target in an unknown environment based on machine learning, 2019, J. Ambient Intell. Humaniz. Comput.
[5] Quan Liu, et al. Efficient reinforcement learning in continuous state and action spaces with Dyna and policy approximation, 2018, Frontiers of Computer Science.
[6] Richard S. Sutton, et al. Integrated Architectures for Learning, Planning, and Reacting Based on Approximating Dynamic Programming, 1990, ML.
[7] Richard S. Sutton, et al. Introduction to Reinforcement Learning, 1998.
[8] Guillaume Lample, et al. Playing FPS Games with Deep Reinforcement Learning, 2016, AAAI.
[9] Zhao Li, et al. Improving selection strategies in zeroth-level classifier systems based on average reward reinforcement learning, 2018.
[10] Jing Peng, et al. Efficient Learning and Planning Within the Dyna Framework, 1993, Adapt. Behav.
[11] Ali Farhadi, et al. Target-driven visual navigation in indoor scenes using deep reinforcement learning, 2016, 2017 IEEE International Conference on Robotics and Automation (ICRA).
[12] Robert Babuska, et al. Efficient Model Learning Methods for Actor–Critic Control, 2012, IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics).
[13] Alborz Geramifard, et al. Dyna-Style Planning with Linear Function Approximation and Prioritized Sweeping, 2008, UAI.
[14] Michael L. Littman, et al. Reinforcement learning improves behaviour from evaluative feedback, 2015, Nature.
[15] Martial Hebert, et al. Improved Learning of Dynamics Models for Control, 2016, ISER.
[16] Robert Babuska, et al. Model learning actor-critic algorithms: Performance evaluation in a motion control task, 2012, 2012 IEEE 51st IEEE Conference on Decision and Control (CDC).
[17] Roland Siegwart, et al. Control of a Quadrotor With Reinforcement Learning, 2017, IEEE Robotics and Automation Letters.
[18] Martial Hebert, et al. Improving Multi-Step Prediction of Learned Time Series Models, 2015, AAAI.
[19] Dazi Li, et al. Sustainable ℓ2-regularized actor-critic based on recursive least-squares temporal difference learning, 2017, 2017 IEEE International Conference on Systems, Man, and Cybernetics (SMC).
[20] Qinglai Wei, et al. Data-Driven Zero-Sum Neuro-Optimal Control for a Class of Continuous-Time Unknown Nonlinear Systems With Disturbance Using ADP, 2016, IEEE Transactions on Neural Networks and Learning Systems.