Design and Implementation of Fuzzy Policy Gradient Gait Learning Method for Walking Pattern Generation of Humanoid Robots

This paper proposes the design and implementation of a Fuzzy Policy Gradient Learning (FPGL) method for humanoid robots. It introduces the phases of humanoid walking and then improves and parameterizes the robot's gait pattern. FPGL is an integrated machine learning method that combines Policy Gradient Reinforcement Learning (PGRL) with fuzzy logic to improve the efficiency and speed of gait learning. Experimental results show that FPGL can train the gait pattern from a walking speed of 9.26 mm/s to 162.27 mm/s within an hour. The training data also show that the method improves on the efficiency of basic PGRL by up to 13%. The experiments further confirm that arm movement reduces the tilt of the trunk. Together, these results demonstrate the feasibility and flexibility of the proposed method.
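To make the combination of PGRL and fuzzy logic concrete, the following is a minimal sketch and not the paper's implementation. It assumes a finite-difference style of policy gradient, in which each gait parameter is perturbed by a small amount and the resulting walking scores are compared, and it pairs that with a hypothetical two-rule fuzzy scaling of the step size. The function names (`pgrl_step`, `fuzzy_step_scale`), the rule consequents 0.5 and 1.5, and the normalized "improvement" input are all illustrative assumptions; the abstract does not specify the fuzzy rules or the gait parameters.

```python
import math
import random


def fuzzy_step_scale(improvement):
    """Hypothetical fuzzy scaling of the learning step (an assumption).

    Two rules, combined as a weighted average:
      IF improvement is LOW  THEN scale = 0.5
      IF improvement is HIGH THEN scale = 1.5
    `improvement` is assumed to be normalized to [0, 1].
    """
    x = min(max(improvement, 0.0), 1.0)
    low, high = 1.0 - x, x  # complementary linear memberships
    return (low * 0.5 + high * 1.5) / (low + high)


def pgrl_step(theta, score, epsilon, eta, n_policies=12, rng=random):
    """One finite-difference policy gradient step over gait parameters.

    theta   -- current parameter vector (list of floats)
    score   -- callable returning a scalar to maximize, e.g. walking speed
    epsilon -- per-parameter perturbation sizes
    eta     -- step length taken along the normalized gradient estimate
    """
    dims = len(theta)
    # Evaluate policies whose parameters are each shifted by -eps, 0 or +eps.
    trials = []
    for _ in range(n_policies):
        signs = [rng.choice((-1, 0, 1)) for _ in range(dims)]
        candidate = [t + s * e for t, s, e in zip(theta, signs, epsilon)]
        trials.append((signs, score(candidate)))

    grad = []
    for d in range(dims):
        groups = {-1: [], 0: [], 1: []}
        for signs, value in trials:
            groups[signs[d]].append(value)
        if not groups[1] or not groups[-1]:
            grad.append(0.0)  # not enough samples to compare directions
            continue
        plus = sum(groups[1]) / len(groups[1])
        minus = sum(groups[-1]) / len(groups[-1])
        zero = sum(groups[0]) / len(groups[0]) if groups[0] else float("-inf")
        # Leave the parameter alone if the unperturbed value scores best.
        grad.append(0.0 if zero > plus and zero > minus else plus - minus)

    norm = math.sqrt(sum(g * g for g in grad))
    if norm == 0.0:
        return list(theta)
    return [t + eta * g / norm for t, g in zip(theta, grad)]
```

In a training loop under these assumptions, the caller would multiply a base step length by `fuzzy_step_scale(...)` before passing it as `eta`, so that the update is more aggressive while the gait is still improving quickly and more cautious once gains level off.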
