Paper Information - Abstracting Complex Domains Using Modular Object-Oriented Markov Decision Processes

We present an initial proposal for modular object-oriented MDPs, an extension of object-oriented MDPs (OO-MDPs) that abstracts complex domains that are partially observable, stochastic, and have multiple goals. Modes mitigate the curse of dimensionality by restricting the attributes, objects, and actions to only those relevant to each goal. A mode may also serve as an abstracted domain that can be transferred to other modes or to another domain.
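
The idea of a mode as a goal-specific restriction of attributes, objects, and actions can be illustrated with a minimal sketch. This is not the authors' implementation: the `Mode` class, the `abstract_state` and `abstract_actions` helpers, the dictionary-based state layout, and the toy `gather_wood` example are all hypothetical names assumed here for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mode:
    """Hypothetical mode: the part of an OO-MDP relevant to one goal."""
    name: str
    object_classes: frozenset  # object classes the goal depends on
    attributes: frozenset      # attribute names kept in the abstraction
    actions: frozenset         # actions available while pursuing the goal


def abstract_state(state, mode):
    """Project a full state onto the objects and attributes a mode keeps.

    Assumed layout: `state` maps object id -> {"class": str, attr: value}.
    """
    return {
        obj_id: {k: v for k, v in obj.items() if k in mode.attributes}
        for obj_id, obj in state.items()
        if obj["class"] in mode.object_classes
    }


def abstract_actions(all_actions, mode):
    """Keep only the actions the mode allows, preserving their order."""
    return [a for a in all_actions if a in mode.actions]


# Toy domain (invented for this sketch): three objects, seven actions.
state = {
    "agent": {"class": "agent", "x": 1, "y": 2, "health": 9},
    "tree1": {"class": "tree", "x": 4, "y": 2, "height": 7},
    "wolf1": {"class": "enemy", "x": 8, "y": 8, "health": 3},
}
all_actions = ["north", "south", "east", "west", "chop", "attack", "eat"]

gather_wood = Mode(
    name="gather_wood",
    object_classes=frozenset({"agent", "tree"}),
    attributes=frozenset({"x", "y"}),
    actions=frozenset({"north", "south", "east", "west", "chop"}),
)

small_state = abstract_state(state, gather_wood)
small_actions = abstract_actions(all_actions, gather_wood)
```

In this toy case the mode drops the enemy object, the `health` and `height` attributes, and two of the seven actions, so a planner or learner working inside the mode sees a smaller state and action space than the full domain.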

Marie desJardins | Shawn Squire