Adaptive motor patterns and reflexes for bipedal locomotion on rough terrain

The Bio-inspired Behavior-Based Bipedal Locomotion Control (B4LC) system consists of control units encapsulating feed-forward and feedback mechanisms, namely motor patterns and reflexes. To improve the performance of these motor patterns and reflexes with respect to stable locomotion on both even and uneven terrain, we present a learning scheme embedded in the B4LC system. The scheme combines the Particle Swarm Optimization (PSO) method with Expectation-Maximization-based Reinforcement Learning (EM-RL): a learning unit comprising an optimization module and a learning module is embedded in the hierarchical control structure. The optimization module optimizes the motor patterns at the hip and ankle joints with respect to energy consumption, stability, and velocity control. The learning module generates compensating torques at the ankle joints that counteract disturbances by combining basis functions derived from state information with the policy parameters. The optimization and learning procedures are conducted on a simulated robot with 21 DoFs. The simulation results show that the robot with optimized motor patterns and learned reflexes achieves more robust and stable locomotion on even and uneven terrain.
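
To make the two learning components concrete, the sketch below is a minimal, hypothetical Python outline of (1) a standard PSO update over motor-pattern parameters and (2) a reward-weighted, EM-style update of the reflex policy weights that map state-dependent basis functions to compensating ankle torques. The fitness and reward definitions, feature choices, dimensions, and all function names are assumptions made for illustration; none of this is taken from the B4LC implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# --- Optimization module: PSO over motor-pattern parameters (hip/ankle) -----
def pso(fitness, dim, n_particles=20, iters=50, w=0.7, c1=1.5, c2=1.5):
    """Minimize `fitness`, e.g. a weighted sum of energy consumption,
    an instability measure, and the velocity-tracking error (assumed here)."""
    x = rng.uniform(-1.0, 1.0, (n_particles, dim))       # particle positions
    v = np.zeros_like(x)                                  # particle velocities
    p_best = x.copy()
    p_val = np.array([fitness(p) for p in x])
    g_best = p_best[p_val.argmin()].copy()
    for _ in range(iters):
        r1, r2 = rng.random((2, n_particles, dim))
        v = w * v + c1 * r1 * (p_best - x) + c2 * r2 * (g_best - x)
        x = x + v
        f = np.array([fitness(p) for p in x])
        improved = f < p_val
        p_best[improved], p_val[improved] = x[improved], f[improved]
        g_best = p_best[p_val.argmin()].copy()
    return g_best

# --- Learning module: reflex torque from basis functions and policy weights -
def basis(state):
    """Assumed radial-basis features of the state (e.g. trunk pitch)."""
    centers = np.linspace(-0.5, 0.5, 10)
    return np.exp(-((state[0] - centers) ** 2) / 0.02)

def reflex_torque(state, theta):
    """Compensating ankle torque: tau = phi(state)^T theta."""
    return basis(state) @ theta

def em_rl_update(rollouts):
    """Reward-weighted (EM-style) parameter update: the new theta is the
    return-weighted average of the perturbed parameters from the rollouts."""
    thetas = np.array([r["theta"] for r in rollouts])
    returns = np.array([r["return"] for r in rollouts])
    weights = returns - returns.min()                     # non-negative weights
    return (weights[:, None] * thetas).sum(0) / (weights.sum() + 1e-12)
```

In this reading, PSO tunes the feed-forward motor-pattern parameters offline against a locomotion fitness, while the EM-style update adapts the feedback reflex weights from rollouts perturbed around the current policy; how the two interleave inside the hierarchical controller is not specified here.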
