Self-scaling reinforcement learning for fuzzy logic controller: applications to motion control of two-link brachiation robot

In this paper, we propose a new reinforcement learning algorithm that generates a fuzzy controller for robot motions. The algorithm produces a range of continuous real-valued actions, and its reinforcement signal is self-scaled, which prevents the weights from overshooting when the system receives very large reinforcement values. As a result, the algorithm can obtain a solution in fewer iterations. The proposed method is applied to the control of a brachiation robot, which moves dynamically from branch to branch like a gibbon, swinging its body in a pendulum-like fashion. Through computer simulations, we demonstrate fast convergence and robustness against disturbances.
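
The abstract does not state the scaling rule itself, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes the reinforcement is divided by the largest magnitude seen so far, so the scaled signal stays within [-1, 1] and a very large raw value cannot produce an oversized weight step. The class name `SelfScaledUpdater`, the `eligibility` vector, and the learning rate are all hypothetical names introduced for illustration.

```python
class SelfScaledUpdater:
    """Illustrative weight updater with a self-scaled reinforcement signal.

    Assumption (not from the paper): scaling divides each reinforcement by
    the running maximum of its absolute value.
    """

    def __init__(self, n_weights, learning_rate=0.1):
        self.weights = [0.0] * n_weights
        self.learning_rate = learning_rate
        self.max_abs_reinforcement = 0.0  # largest |r| observed so far

    def scale(self, reinforcement):
        """Return the reinforcement mapped into [-1, 1]."""
        self.max_abs_reinforcement = max(
            self.max_abs_reinforcement, abs(reinforcement)
        )
        if self.max_abs_reinforcement == 0.0:
            return 0.0  # no non-zero signal seen yet
        return reinforcement / self.max_abs_reinforcement

    def update(self, reinforcement, eligibility):
        """Move each weight by learning_rate * scaled signal * eligibility."""
        scaled = self.scale(reinforcement)
        for i, e in enumerate(eligibility):
            self.weights[i] += self.learning_rate * scaled * e
        return scaled


updater = SelfScaledUpdater(n_weights=2, learning_rate=0.1)
first = updater.update(1000.0, [1.0, -1.0])   # huge raw reinforcement
second = updater.update(500.0, [1.0, 0.0])    # half the running maximum
```

Under this assumption, the size of each weight step is bounded by the learning rate times the eligibility, regardless of how large the raw reinforcement is.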
