Paper Information - Fast and Stable Learning in Direct-Vision-Based Reinforcement Learning

Fast and Stable Learning in Direct-Vision-Based Reinforcement Learning

Direct-Vision-Based Reinforcement Learning has been proposed not only for motion planning but also for learning the whole process from sensors to motors in robots, including recognition, attention, and so on. In this approach, raw visual sensory signals are fed directly into a layered neural network, and the network is trained with training signals generated by reinforcement learning. On the other hand, it has been pointed out that combining a neural network with TD-type reinforcement learning sometimes makes learning unstable. This paper shows that each visual sensory cell plays a role in localizing our continuous 3-dimensional space, and that this localization helps learning to be fast and stable. Furthermore, by processing the localized input signals in the layered neural network, a global representation is reconstructed adaptively in the hidden layer through learning, as shown in previous papers.
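The localization idea in the abstract can be illustrated with a minimal sketch. It assumes a hypothetical 1-D array of Gaussian-tuned "visual cells" and, for brevity, a linear value function trained by TD(0) in place of the layered neural network the paper uses. The names `visual_cells`, `value`, and `td0_train`, the toy task (move right until the goal at x = 1), and all parameter values are illustrative assumptions, not taken from the paper.

```python
import math
import random


def visual_cells(x, n_cells=10, width=0.1):
    """Encode a continuous position x in [0, 1) as localized cell responses.

    Assumption: each hypothetical cell has a Gaussian receptive field, so it
    responds strongly only when x is close to its own center.
    """
    centers = [(i + 0.5) / n_cells for i in range(n_cells)]
    return [math.exp(-((x - c) ** 2) / (2 * width ** 2)) for c in centers]


def value(w, phi):
    """Linear state value: weighted sum of the cell responses."""
    return sum(wi * pi for wi, pi in zip(w, phi))


def td0_train(episodes=2000, gamma=0.9, alpha=0.1, step=0.05, seed=0):
    """TD(0) value learning on a toy task with a fixed 'move right' policy.

    The agent starts at a random position and receives reward 1 on reaching
    x >= 1, which ends the episode. Only the weights of cells that respond to
    the current position change noticeably, because inactive cells have
    near-zero input.
    """
    rng = random.Random(seed)
    n_cells = 10
    w = [0.0] * n_cells
    for _ in range(episodes):
        x = rng.random()
        while True:
            phi = visual_cells(x, n_cells)
            x_next = x + step
            done = x_next >= 1.0
            if done:
                target = 1.0  # terminal reward, no bootstrapping
            else:
                target = gamma * value(w, visual_cells(x_next, n_cells))
            delta = target - value(w, phi)  # TD error
            w = [wi + alpha * delta * pi for wi, pi in zip(w, phi)]
            if done:
                break
            x = x_next
    return w


if __name__ == "__main__":
    weights = td0_train()
    for pos in (0.1, 0.5, 0.95):
        print(pos, round(value(weights, visual_cells(pos)), 3))
```

Because each update is scaled by the cell responses, a TD error at one position mainly adjusts the weights of nearby cells. This is one simple way to picture how localized inputs can limit interference between distant states.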

Katsunari Shibata | M. Sugisaka | K. Ito

[1] C.W. Anderson, et al. Learning to control an inverted pendulum using neural networks, 1989, IEEE Control Systems Magazine.

[2] Geoffrey J. Gordon. Stable Function Approximation in Dynamic Programming, 1995, ICML.

[3] R. J. Williams, et al. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning, 1992, Machine Learning.

[4] Katsunari Shibata, et al. Gauss-sigmoid neural network, 1999, IJCNN'99, International Joint Conference on Neural Networks.

[5] K. Shibata, et al. Hand Reaching Movement Acquired through Reinforcement Learning, 2000.

[6] Richard S. Sutton, et al. Generalization in Reinforcement Learning: Successful Examples Using Sparse Coarse Coding, 1996.

[7] G. Tesauro. Practical Issues in Temporal Difference Learning, 1992.

[8] Andrew W. Moore, et al. Generalization in Reinforcement Learning: Safely Approximating the Value Function, 1994, NIPS.

[9] Katsunari Shibata, et al. Reinforcement learning when visual sensory signals are directly given as inputs, 1997, Proceedings of International Conference on Neural Networks (ICNN'97).