Improving Input-Output Linearizing Controllers for Bipedal Robots via Reinforcement Learning

The main drawbacks of input-output linearizing controllers are the need for a precise dynamics model and the inability to account for input constraints. Model uncertainty is common in almost every robotic application, and input saturation is present in every real-world system. In this paper, we address both challenges for the specific case of bipedal robot control using reinforcement learning techniques. Retaining the structure of a standard input-output linearizing controller, we add a learned term that compensates for model uncertainty. Moreover, by adding constraints to the learning problem, we improve the performance of the final controller when input limits are present. We demonstrate the effectiveness of the proposed framework for different levels of uncertainty on the five-link planar walking robot RABBIT.
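The additive structure described above can be illustrated on a toy system. The following is a minimal sketch (not the paper's implementation, and on a pendulum rather than a biped): a nominal input-output linearizing law built from wrong model parameters, plus an additive correction term standing in for the learned policy. Here the correction is the exact model-mismatch compensation that a trained RL policy would only approximate; all parameter values are illustrative assumptions.

```python
import numpy as np

# Toy pendulum: theta_ddot = (u - m g l sin(theta)) / (m l^2).
# The controller only knows nominal (wrong) parameters.
g = 9.81
m_true, l_true = 1.2, 0.9   # true plant (unknown to the controller)
m_nom, l_nom = 1.0, 1.0     # nominal model used by the controller
KP, KD = 25.0, 10.0         # PD gains for the outer loop on y = theta

def dynamics(th, dth, u):
    # True plant acceleration.
    return (u - m_true * g * l_true * np.sin(th)) / (m_true * l_true**2)

def v_pd(th, dth, th_des):
    # Auxiliary input for the linearized output dynamics.
    return -KP * (th - th_des) - KD * dth

def u_nominal(th, dth, th_des):
    # Standard IO-linearizing law computed with the nominal model.
    v = v_pd(th, dth, th_des)
    return m_nom * l_nom**2 * v + m_nom * g * l_nom * np.sin(th)

def u_correction(th, dth, th_des):
    # Stand-in for the additive learned term: the exact mismatch
    # compensation, which an RL policy would approximate from data.
    v = v_pd(th, dth, th_des)
    return ((m_true * l_true**2 - m_nom * l_nom**2) * v
            + (m_true * l_true - m_nom * l_nom) * g * np.sin(th))

def simulate(use_correction, th_des=0.5, T=4.0, dt=1e-3):
    th, dth = 0.0, 0.0
    for _ in range(int(T / dt)):
        u = u_nominal(th, dth, th_des)
        if use_correction:
            u += u_correction(th, dth, th_des)
        dth += dynamics(th, dth, u) * dt
        th += dth * dt
    return abs(th - th_des)  # final tracking error

err_plain = simulate(False)      # nominal law alone: steady-state error
err_comp = simulate(True)        # with additive compensation
```

With the additive term, the closed-loop output dynamics recover the intended linear form `theta_ddot = v`, so the tracking error decays exponentially; without it, the model mismatch leaves a residual steady-state error.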
