Using Options to Accelerate Learning of New Tasks According to Human Preferences

Rodrigo Cesar Bonini | Felipe Leno da Silva | Edison Spina | Anna Helena Reali Costa

[1] Qun Zong, et al. Self-adaptive multi-objective optimization method design based on agent reinforcement learning for elevator group control systems, 2010, 2010 8th World Congress on Intelligent Control and Automation.

[2] Andrew G. Barto, et al. PolicyBlocks: An Algorithm for Creating Useful Macro-Actions in Reinforcement Learning, 2002, ICML.

[3] Sebastian Thrun, et al. Finding Structure in Reinforcement Learning, 1994, NIPS.

[4] Ann Nowé, et al. Scalarized multi-objective reinforcement learning: Novel design techniques, 2013, 2013 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL).

[5] Matthew E. Taylor, et al. Multi-objectivization of reinforcement learning problems by reward shaping, 2014, 2014 International Joint Conference on Neural Networks (IJCNN).

[6] Andrew G. Barto, et al. Automatic Discovery of Subgoals in Reinforcement Learning using Diverse Density, 2001, ICML.

[7] Martin L. Puterman, et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming, 1994.

[8] Ben Tse, et al. Autonomous Inverted Helicopter Flight via Reinforcement Learning, 2004, ISER.

[9] David Fairbairn, et al. Mobile Geographic Information Handling Technologies to Support Disaster Management, 2003.

[10] Gerald Tesauro, et al. TD-Gammon: A Self-Teaching Backgammon Program, 1995.

[11] Anna Helena Reali Costa, et al. Stochastic Abstract Policies: Generalizing Knowledge to Improve Reinforcement Learning, 2015, IEEE Transactions on Cybernetics.

[12] Dewen Hu, et al. Multiobjective Reinforcement Learning: A Comprehensive Overview, 2015, IEEE Transactions on Systems, Man, and Cybernetics: Systems.

[13] Doina Precup, et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning, 1999, Artif. Intell.