Q-learning in continuous state-action space with redundant dimensions by using a selective desensitization neural network

When applying reinforcement learning algorithms such as Q-learning to real-world problems, we must account for the high dimensionality, redundancy, and continuity of the state-action space. The continuity of the state-action space is typically handled by value function approximation. However, conventional function approximators such as radial basis function networks (RBFNs) are unsuitable in such environments because they incur high computational cost and the number of required experiences grows exponentially with the dimensionality of the state-action space. By contrast, the selective desensitization neural network (SDNN) is highly robust to redundant inputs and operates at low computational cost. This paper proposes a novel SDNN-based function approximation method for Q-learning in continuous state-action spaces. The proposed method is evaluated in numerical experiments with redundant inputs. The experimental results demonstrate that the proposed method is robust to redundant state dimensions and has a lower computational cost than the RBFN. These properties are advantageous for real-world applications such as robotic systems.
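To illustrate the setting the abstract refers to, the following is a minimal sketch of Q-learning with a conventional linear function approximator over RBF features, assuming a continuous 2-D state space and a small discrete action set for simplicity. This is not the paper's SDNN method; all names (rbf_features, QApproximator, the grid of centers, the learning rates) are hypothetical choices made for illustration. Note that the number of RBF centers in such a grid grows exponentially with the state dimension, which is the scaling problem the abstract attributes to RBFNs.

```python
import numpy as np

rng = np.random.default_rng(0)

# Fixed RBF centers covering a 2-D state space in [0, 1]^2 (illustrative only).
centers = np.array([[i / 4, j / 4] for i in range(5) for j in range(5)])
width = 0.15  # shared RBF width


def rbf_features(state):
    """Gaussian RBF activations for a continuous state vector."""
    d2 = np.sum((centers - state) ** 2, axis=1)
    return np.exp(-d2 / (2 * width ** 2))


class QApproximator:
    """Linear Q-value approximation: Q(s, a) = w[a] . phi(s)."""

    def __init__(self, n_actions, n_features, alpha=0.1, gamma=0.95):
        self.w = np.zeros((n_actions, n_features))
        self.alpha = alpha  # learning rate
        self.gamma = gamma  # discount factor

    def value(self, state, action):
        return self.w[action] @ rbf_features(state)

    def update(self, state, action, reward, next_state, done):
        """One Q-learning (TD) update of the approximator weights."""
        phi = rbf_features(state)
        target = reward
        if not done:
            target += self.gamma * max(self.w[a] @ rbf_features(next_state)
                                       for a in range(self.w.shape[0]))
        td_error = target - self.w[action] @ phi
        self.w[action] += self.alpha * td_error * phi


# Toy usage: epsilon-greedy action selection and one update on a fake transition.
q = QApproximator(n_actions=4, n_features=len(centers))
state = rng.random(2)
if rng.random() < 0.1:
    action = int(rng.integers(4))
else:
    action = int(np.argmax([q.value(state, a) for a in range(4)]))
next_state = np.clip(state + rng.normal(scale=0.05, size=2), 0.0, 1.0)
q.update(state, action, reward=1.0, next_state=next_state, done=False)
```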