State and action space construction using vision information

To apply reinforcement learning to the real world, sensor data must be pre-processed into a form adequate for action learning. Since it is difficult to construct a state space and learn appropriate actions simultaneously, we assume that an evaluation, good or bad, is given for each action step. Under this condition, we propose a method for dividing and clustering the state space. The TRN (topology representing network) is a vector quantization algorithm that preserves the topology of the input space. We apply the TRN algorithm to our problem, extending it with dynamically increasing nodes and the idea of a radial basis function.
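To make the idea concrete, the following is a minimal Python sketch, not the paper's implementation: a TRN-style quantizer that links the two nearest nodes by competitive Hebbian learning, inserts a new node when an input is poorly covered, and reads out a normalized radial-basis activation over the nodes as a soft state representation. The growth rule (insert a node whenever the nearest reference vector is farther than a fixed threshold) is a simple stand-in for the insertion criterion, which the abstract does not specify; the class name `GrowingTRN` and all hyperparameter values are likewise illustrative assumptions.

```python
# Minimal sketch of a TRN-style vector quantizer with dynamically added
# nodes and an RBF state readout. The growth rule and all hyperparameter
# values are illustrative assumptions, not taken from the paper.
import numpy as np

class GrowingTRN:
    def __init__(self, dim, eps_winner=0.2, eps_neighbor=0.02,
                 grow_threshold=0.5, sigma=0.3):
        self.nodes = np.empty((0, dim))   # reference vectors (codebook)
        self.edges = set()                # topology edges between node indices
        self.eps_winner = eps_winner      # learning rate for the winner node
        self.eps_neighbor = eps_neighbor  # learning rate for its neighbors
        self.grow_threshold = grow_threshold  # distance that triggers a new node
        self.sigma = sigma                # RBF width

    def update(self, x):
        """Present one input; adapt the codebook or insert a new node."""
        if len(self.nodes) < 2:
            self.nodes = np.vstack([self.nodes, x])
            return
        d = np.linalg.norm(self.nodes - x, axis=1)
        s1, s2 = np.argsort(d)[:2]        # two nearest reference vectors
        if d[s1] > self.grow_threshold:   # input poorly covered: grow the net
            self.nodes = np.vstack([self.nodes, x])
            return
        # Competitive Hebbian learning: connect the two nearest nodes.
        self.edges.add((int(min(s1, s2)), int(max(s1, s2))))
        # Move the winner and its topological neighbors toward the input.
        self.nodes[s1] += self.eps_winner * (x - self.nodes[s1])
        for a, b in self.edges:
            if s1 in (a, b):
                n = b if a == s1 else a
                self.nodes[n] += self.eps_neighbor * (x - self.nodes[n])

    def rbf_state(self, x):
        """Soft state vector: normalized RBF activation of every node."""
        d = np.linalg.norm(self.nodes - x, axis=1)
        a = np.exp(-d**2 / (2 * self.sigma**2))
        return a / a.sum()

# Usage: quantize noisy 2-D inputs sampled from a ring.
rng = np.random.default_rng(0)
trn = GrowingTRN(dim=2)
for _ in range(3000):
    theta = rng.uniform(0, 2 * np.pi)
    trn.update(np.array([np.cos(theta), np.sin(theta)])
               + 0.05 * rng.standard_normal(2))
print(len(trn.nodes), "nodes,", len(trn.edges), "edges")
print(trn.rbf_state(np.array([1.0, 0.0])).round(3))
```

In a reinforcement-learning setting, the normalized RBF activations would serve as the soft, topology-preserving state representation handed to the action learner, in place of a hard nearest-node assignment.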
