Research on task decomposition and state abstraction in reinforcement learning

Task decomposition and state abstraction are crucial techniques in reinforcement learning. They allow an agent to ignore aspects of its current state that are irrelevant to its current decision, thereby speeding up dynamic programming and learning. This paper presents the SVI algorithm, which uses a dynamic Bayesian network model to construct an influence graph that indicates the relationships between state variables. SVI then performs state abstraction for each subtask by ignoring irrelevant state variables and lower-level subtasks. Experimental results show that the task decomposition introduced by SVI can significantly accelerate the construction of a near-optimal policy. This general framework can be applied to a broad spectrum of complex real-world problems such as robotics, industrial manufacturing, and games.
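
To illustrate the idea (not the authors' implementation), the following minimal Python sketch assumes the DBN structure is given as, for each state variable, the set of variables at time t that influence it at time t+1. From this an influence graph can be built, and the variables relevant to a subtask can be found by walking influence edges backwards from the variables its reward depends on; all other variables can be abstracted away for that subtask. The domain variables and the "navigate to passenger" subtask are hypothetical examples.

```python
from collections import defaultdict, deque

# Hypothetical DBN structure: for each state variable X, the set of state
# variables at time t that influence X at time t+1 (illustrative assumption).
dbn_parents = {
    "taxi_pos":      {"taxi_pos"},
    "passenger_loc": {"passenger_loc", "taxi_pos"},
    "destination":   {"destination"},
    "fuel":          {"fuel", "taxi_pos"},
}

def build_influence_graph(parents):
    """Edges point from an influencing variable to the variable it affects."""
    graph = defaultdict(set)
    for child, pars in parents.items():
        for p in pars:
            if p != child:
                graph[p].add(child)
    return graph

def relevant_variables(parents, reward_vars):
    """State variables that (transitively) influence the subtask's reward.
    Everything else can be ignored when solving that subtask."""
    relevant = set(reward_vars)
    frontier = deque(reward_vars)
    while frontier:
        var = frontier.popleft()
        for p in parents.get(var, ()):  # walk influence edges backwards
            if p not in relevant:
                relevant.add(p)
                frontier.append(p)
    return relevant

if __name__ == "__main__":
    graph = build_influence_graph(dbn_parents)
    # For a hypothetical "navigate to passenger" subtask whose reward depends
    # only on taxi_pos and passenger_loc, destination and fuel are abstracted
    # away because they do not influence those variables.
    print(relevant_variables(dbn_parents, {"taxi_pos", "passenger_loc"}))
```

In this sketch the abstraction for the navigation subtask keeps only taxi_pos and passenger_loc, so the subtask's value function is defined over a much smaller state space than the full problem.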