论文信息 - Potential-based Shaping in Model-based Reinforcement Learning

Potential-based Shaping in Model-based Reinforcement Learning

Potential-based shaping was designed as a way of introducing background knowledge into model-free reinforcement-learning algorithms. By identifying states that are likely to have high value, this approach can decrease experience complexity--the number of trials needed to find near-optimal behavior. An orthogonal way of decreasing experience complexity is to use a model-based learning approach, building and exploiting an explicit transition model. In this paper, we show how potential-based shaping can be redefined to work in the model-based setting to produce an algorithm that shares the benefits of both ideas.

[1] Martin L. Puterman,et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming , 1994 .

[2] Peter Norvig,et al. Artificial Intelligence: A Modern Approach , 1995 .

[3] Andrew G. Barto,et al. Learning to Act Using Real-Time Dynamic Programming , 1995, Artif. Intell..

[4] Thomas G. Dietterich. What is machine learning? , 2020, Archives of Disease in Childhood.

[5] Andrew Y. Ng,et al. Policy Invariance Under Reward Transformations: Theory and Application to Reward Shaping , 1999, ICML.

[6] S. Zilberstein,et al. Solving Markov Decision Problems Using Heuristic Search , 1999 .

[7] Dale Schuurmans,et al. Algorithm-Directed Exploration for Model-Based Reinforcement Learning in Factored MDPs , 2002, ICML.

[8] Ronen I. Brafman,et al. R-MAX - A General Polynomial Time Algorithm for Near-Optimal Reinforcement Learning , 2001, J. Mach. Learn. Res..

[9] Sham M. Kakade,et al. On the sample complexity of reinforcement learning. , 2003 .

[10] Eric Wiewiora,et al. Potential-Based Shaping and Q-Value Initialization are Equivalent , 2003, J. Artif. Intell. Res..

[11] Peter Dayan,et al. Q-learning , 1992, Machine Learning.

[12] Alexander L. Strehl,et al. PAC Reinforcement Learning Bounds for RTDP and Rand-RTDP Technical Report , 2006 .

[13] Michael L. Littman,et al. Efficient Reinforcement Learning with Relocatable Action Models , 2007, AAAI.