Paper information - Coarticulation in Markov Decision Processes

Coarticulation in Markov Decision Processes

We investigate an approach for committing to multiple activities simultaneously, each modeled as a temporally extended action in a semi-Markov decision process (SMDP). For each activity we define a set of admissible solutions, consisting of the redundant set of optimal policies together with those policies that ascend the activity's optimal state-value function. A plan is then generated by merging the activities so that the solutions to the subordinate activities are realized within the set of admissible solutions satisfying the superior activities. We present our theoretical results and empirically evaluate our approach in a simulated domain.
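The merging idea in the abstract can be illustrated with a small sketch. This is not the authors' algorithm; it is a toy version under stated assumptions: a deterministic grid world, two goal-reaching activities, and negative Manhattan distance standing in for each activity's optimal state-value function. The names `make_value`, `admissible`, `coarticulate`, and `rollout` are hypothetical and chosen only for illustration. An action counts as admissible for an activity if it strictly increases that activity's value, and activities are processed from superior to subordinate.

```python
from typing import Callable, Dict, List, Sequence, Tuple

State = Tuple[int, int]
ValueFn = Callable[[State], float]

# Deterministic grid moves (assumption: unbounded grid, no obstacles).
MOVES: Dict[str, Tuple[int, int]] = {
    "right": (1, 0), "left": (-1, 0), "up": (0, 1), "down": (0, -1),
}


def step(state: State, action: str) -> State:
    """Apply a move and return the next state."""
    dx, dy = MOVES[action]
    return (state[0] + dx, state[1] + dy)


def make_value(goal: State) -> ValueFn:
    """Stand-in for an activity's optimal state-value function (assumption):
    negative Manhattan distance to the activity's goal."""
    return lambda s: -float(abs(s[0] - goal[0]) + abs(s[1] - goal[1]))


def admissible(value: ValueFn, state: State) -> List[str]:
    """Actions that ascend the given value function from `state`."""
    return [a for a in MOVES if value(step(state, a)) > value(state)]


def coarticulate(values: Sequence[ValueFn], state: State) -> str:
    """Pick an action by narrowing the candidate set activity by activity,
    ordered from superior to subordinate. A subordinate activity may only
    choose among actions the superior activities already admit."""
    candidates = list(MOVES)
    for value in values:
        allowed = admissible(value, state)
        narrowed = [a for a in candidates if a in allowed]
        if narrowed:  # keep the previous set if this activity cannot be served
            candidates = narrowed
    return candidates[0]


def rollout(values: Sequence[ValueFn], start: State, max_steps: int = 50) -> List[State]:
    """Follow the merged policy until the superior activity's goal is reached."""
    path = [start]
    while values[0](path[-1]) < 0 and len(path) <= max_steps:
        path.append(step(path[-1], coarticulate(values, path[-1])))
    return path


if __name__ == "__main__":
    superior = make_value((3, 3))     # higher-priority activity
    subordinate = make_value((3, 0))  # lower-priority activity
    print(rollout([superior, subordinate], (0, 0)))
```

In this toy setting the merged policy reaches the superior goal at (3, 3) in the same number of steps as a policy that ignores the subordinate activity, while also passing through the subordinate goal at (3, 0). The subordinate activity is served only by choices that the superior activity leaves free.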
