Complex-valued reinforcement learning with hierarchical architecture

Hierarchical complex-valued reinforcement learning is proposed in order to solve the perceptual aliasing problem. The perceptual aliasing problem is encountered when an incomplete set of sensors is used in an actual environment, and this problem makes learning difficult for an agent. Hierarchical Q-learning (HQ- learning) and complex-valued reinforcement learning are proposed in order to solve this problem. HQ-learning is a hierarchical extension of Q-learning. In HQ-learning, tasks are divided into sequences of simpler sub-tasks that can be solved by adopting memory-less policies, but a considerable amount of time is required for learning. In complex-valued reinforcement learning, the dependence of contexts can be represented by using complex-valued action-value functions. It enables the agent to adaptively perform actions, but may not deal problems because of the cycle of perceptual aliasing. In this paper, complex-valued reinforcement learning based on HQ-learning with a hierarchical design is proposed. Experimental results show the effectiveness of the proposed method.

[1]  Andrew McCallum,et al.  Instance-Based Utile Distinctions for Reinforcement Learning with Hidden State , 1995, ICML.

[2]  Daphne Koller,et al.  Reinforcement Learning Using Approximate Belief States , 1999, NIPS.

[3]  Jürgen Schmidhuber,et al.  HQ-Learning , 1997, Adapt. Behav..

[4]  Takeshi Shibuya,et al.  Complex-Valued Reinforcement Learning , 2006, 2006 IEEE International Conference on Systems, Man and Cybernetics.