Paper information - Efficient skill learning using abstraction selection - 字舞流文

Efficient skill learning using abstraction selection

We present an algorithm for selecting an appropriate abstraction when learning a new skill. We show empirically that it can consistently select an appropriate abstraction using very little sample data, and that it significantly improves skill learning performance in a reasonably large real-valued reinforcement learning domain.

Andrew G. Barto | George Konidaris

[1] G. Schwarz. Estimating the Dimension of a Model, 1978.

[2] Maja J. Matarić, et al. Learning to Use Selective Attention and Short-Term Memory in Sequential Tasks, 1996.

[3] Doina Precup, et al. Intra-Option Learning about Temporally Abstract Actions, 1998, ICML.

[4] Richard S. Sutton, et al. Introduction to Reinforcement Learning, 1998.

[5] Doina Precup, et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning, 1999, Artif. Intell.

[6] Justin A. Boyan, et al. Least-Squares Temporal Difference Learning, 1999, ICML.

[7] Thomas G. Dietterich. Hierarchical Reinforcement Learning with the MAXQ Value Function Decomposition, 1999, J. Artif. Intell. Res.

[8] Andrew G. Barto, et al. Automated State Abstraction for Options using the U-Tree Algorithm, 2000, NIPS.

[9] Bernhard Hengst, et al. Discovering Hierarchy in Reinforcement Learning with HEXQ, 2002, ICML.

[10] Sridhar Mahadevan, et al. Recent Advances in Hierarchical Reinforcement Learning, 2003, Discret. Event Dyn. Syst.

[11] Nuttapong Chentanez, et al. Intrinsically Motivated Reinforcement Learning, 2004, NIPS.

[12] Alicia P. Wolfe, et al. Identifying useful subgoals in reinforcement learning by local graph partitioning, 2005, ICML.

[13] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.

[14] Christos G. Cassandras, et al. Discrete-Event Systems, 2005, Handbook of Networked and Embedded Control Systems.

[15] Christopher M. Bishop, et al. Pattern Recognition and Machine Learning (Information Science and Statistics), 2006.

[16] Thomas J. Walsh, et al. Towards a Unified Theory of State Abstraction for MDPs, 2006, AI&M.

[17] Radford M. Neal. Pattern Recognition and Machine Learning, 2007, Technometrics.

[18] Bram Bakker, et al. Reinforcement Learning with Multiple, Qualitatively Different State Representations, 2007.

[19] Nasser M. Nasrabadi, et al. Pattern Recognition and Machine Learning, 2006, Technometrics.

[20] George Konidaris, et al. Value Function Approximation in Reinforcement Learning Using the Fourier Basis, 2011, AAAI.