Using Options to Accelerate Learning of New Tasks According to Human Preferences
Rodrigo Cesar Bonini | Felipe Leno da Silva | Edison Spina | Anna Helena Reali Costa
[1] Qun Zong, et al. Self-adaptive multi-objective optimization method design based on agent reinforcement learning for elevator group control systems, 2010, 2010 8th World Congress on Intelligent Control and Automation.
[2] Andrew G. Barto, et al. PolicyBlocks: An Algorithm for Creating Useful Macro-Actions in Reinforcement Learning, 2002, ICML.
[3] Sebastian Thrun, et al. Finding Structure in Reinforcement Learning, 1994, NIPS.
[4] Ann Nowé, et al. Scalarized multi-objective reinforcement learning: Novel design techniques, 2013, 2013 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL).
[5] Matthew E. Taylor, et al. Multi-objectivization of reinforcement learning problems by reward shaping, 2014, 2014 International Joint Conference on Neural Networks (IJCNN).
[6] Andrew G. Barto, et al. Automatic Discovery of Subgoals in Reinforcement Learning using Diverse Density, 2001, ICML.
[7] Martin L. Puterman, et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming, 1994.
[8] Ben Tse, et al. Autonomous Inverted Helicopter Flight via Reinforcement Learning, 2004, ISER.
[9] David Fairbairn, et al. Mobile Geographic Information Handling Technologies to Support Disaster Management, 2003.
[10] Gerald Tesauro, et al. TD-Gammon: A Self-Teaching Backgammon Program, 1995.
[11] Anna Helena Reali Costa, et al. Stochastic Abstract Policies: Generalizing Knowledge to Improve Reinforcement Learning, 2015, IEEE Transactions on Cybernetics.
[12] Dewen Hu, et al. Multiobjective Reinforcement Learning: A Comprehensive Overview, 2015, IEEE Transactions on Systems, Man, and Cybernetics: Systems.
[13] Doina Precup, et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning, 1999, Artif. Intell.