Experimental study of the eligibility traces in complex valued reinforcement learning

The effectiveness of eligibility traces in complex-valued reinforcement learning is studied. Complex-valued reinforcement learning is a new method inspired by complex-valued neural networks. In this study, we aim to carry over various approaches from ordinary real-valued reinforcement learning to the complex-valued setting. This paper focuses on an experimental study of eligibility traces. The simulation results suggest that backing up over long traces may overcome severe perceptual aliasing.
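To make the combination concrete, the following is a minimal, hypothetical sketch of a TD(λ)-style update with complex-valued action values. The paper's exact update rule is not reproduced here; the phase-rotation factor `beta`, the trace handling, and the greedy action selection on the real part are all illustrative assumptions.

```python
import numpy as np

# Hypothetical sketch: TD(lambda) with a complex-valued Q-table.
# 'beta' rotates the phase each step, so a single table entry can carry
# temporal context -- the mechanism assumed here to help under aliasing.

n_states, n_actions = 3, 2
Q = np.zeros((n_states, n_actions), dtype=complex)  # complex action values
E = np.zeros_like(Q)                                # eligibility traces

alpha, gamma, lam = 0.5, 0.9, 0.8
beta = np.exp(1j * np.pi / 6)  # assumed per-step phase rotation

def td_lambda_step(s, a, r, s_next):
    """One accumulating-trace TD(lambda) update (illustrative only)."""
    global Q, E
    # Greedy bootstrap on the real part of the rotated next-state values.
    a_next = int(np.argmax((beta * Q[s_next]).real))
    delta = r + gamma * beta * Q[s_next, a_next] - Q[s, a]
    E[s, a] += 1.0            # accumulating trace at the visited pair
    Q += alpha * delta * E    # all traced pairs share the TD error
    E *= gamma * lam * beta   # decay and rotate older traces

# One short trajectory through the toy state space; the final reward
# is propagated back along the trace in a single update.
for s, a, r, s_next in [(0, 0, 0.0, 1), (1, 1, 0.0, 2), (2, 0, 1.0, 0)]:
    td_lambda_step(s, a, r, s_next)
```

With a long trace (large λ), the final reward reaches state-action pairs visited many steps earlier in one update, which is the kind of long trace backup whose effect on aliased states the paper studies experimentally.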
