Hierarchical Strategy Learning with Hybrid Representations

Good problem-solving knowledge for real-life domains is hard to capture in a single representation. In some situations a direct policy is the better choice, while in others a value function is. Typically, a direct policy representation suits strategic-level plans, while a value-function representation suits tactical-level plans. We propose the hybrid hierarchical representation machine (HHRM), in which direct-policy and value-function-based representations co-exist in a level-wise fashion. We provide simple learning and planning algorithms for the new representation and discuss their application to the Airspace Deconfliction domain. In our experiments, we provided our system, LSP, with a two-level HHRM for the domain. LSP successfully learned from a limited number of expert solution traces and showed performance superior to the average of human novice learners.
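
To make the level-wise split concrete, the following is a minimal sketch of how a two-level hybrid policy could be organized: a strategic level driven by a direct (rule-like) policy that selects a subtask, and a tactical level that acts greedily with respect to a per-subtask value function. All names here (HybridHierarchicalPolicy, strategic_policy, tactical_q) are illustrative assumptions, not the paper's actual LSP implementation.

    from typing import Callable, Dict, Hashable, List

    State = Hashable
    Action = str

    class HybridHierarchicalPolicy:
        """Two-level hybrid: direct policy on top, value function below.

        Hypothetical sketch; the paper's HHRM/LSP internals may differ.
        """

        def __init__(self,
                     strategic_policy: Callable[[State], str],
                     tactical_q: Dict[str, Callable[[State, Action], float]],
                     actions: List[Action]):
            self.strategic_policy = strategic_policy  # direct policy: state -> subtask name
            self.tactical_q = tactical_q              # per-subtask state-action value estimate
            self.actions = actions                    # primitive action set

        def act(self, state: State) -> Action:
            # Strategic level: the direct policy commits to a subtask outright.
            subtask = self.strategic_policy(state)
            # Tactical level: choose the primitive action that maximizes the
            # learned value function for the chosen subtask.
            q = self.tactical_q[subtask]
            return max(self.actions, key=lambda a: q(state, a))

    # Toy usage with hand-written components (for illustration only):
    policy = HybridHierarchicalPolicy(
        strategic_policy=lambda s: "evade" if s == "conflict" else "cruise",
        tactical_q={
            "evade":  lambda s, a: 1.0 if a == "turn_left" else 0.0,
            "cruise": lambda s, a: 1.0 if a == "hold" else 0.0,
        },
        actions=["turn_left", "turn_right", "hold"],
    )
    assert policy.act("conflict") == "turn_left"

The design point this illustrates is that each level can use whichever representation fits it: the strategic level needs no value estimates at all, while the tactical level needs no explicit rules.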
