Effectiveness of Considering State Similarity for Reinforcement Learning

This paper presents a novel approach that identifies states with similar sub-policies and incorporates this information into the reinforcement learning framework to improve learning performance. States are related through their common action sequences, which are derived from possible optimal policies and stored in a tree structure. Based on the number of such shared sequences, we define a similarity function between two states, which is used to propagate updates on the action-value function of a state to all similar states. In this way, experience acquired during learning is applied in a broader context. The effectiveness of the method is demonstrated empirically.
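The similarity-driven update described above can be sketched in a few lines. This is a minimal illustration, not the paper's exact formulation: the Jaccard-style similarity over sets of action sequences, the propagation threshold, and the rule of scaling the same temporal-difference error by the similarity value are all assumptions made for the sake of the example.

```python
from collections import defaultdict


def sequence_similarity(seqs_s, seqs_t):
    """Illustrative similarity: fraction of action sequences (tuples of
    actions) that two states share, out of all sequences observed for either.
    """
    shared = len(seqs_s & seqs_t)
    total = len(seqs_s | seqs_t)
    return shared / total if total else 0.0


def similarity_q_update(Q, s, a, r, s_next, actions, states, sim,
                        alpha=0.5, gamma=0.9, threshold=0.3):
    """One Q-learning step whose TD error is also applied, scaled by the
    similarity value, to every state deemed sufficiently similar to s."""
    target = r + gamma * max(Q[(s_next, b)] for b in actions)
    delta = target - Q[(s, a)]
    Q[(s, a)] += alpha * delta
    for t in states:
        if t != s:
            k = sim(s, t)
            if k >= threshold:
                # Hypothetical propagation rule: reuse the same TD error,
                # weighted by how similar t is to s.
                Q[(t, a)] += alpha * k * delta
    return Q
```

For example, with a zero-initialized table, reward 1.0, and a state `t` whose similarity to `s` is 0.8, a single call updates `Q[(s, a)]` by the full step and `Q[(t, a)]` by the similarity-scaled step, so one experience improves estimates for both states.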
