Reinforcement Learning Compensation based PD Control for a Double Inverted Pendulum

In this paper, we present a control algorithm based on Reinforcement Learning for a double inverted pendulum on a cart. Embedding Q-Learning in the PD control scheme improves the balancing performance of the second (top) pendulum. In a first step, Q-Learning is used to balance the second pendulum about its inverted vertical position, while the first pendulum moves without restriction and the cart displacement stays within ±1 meter. In a second step, we combine Q-Learning and PD control into a hybrid controller and apply it to the system after changing its parameters and initial conditions. The hybrid controller obtains better results than either controller used individually, and simulation results show the effectiveness of the proposed controller.
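The hybrid scheme described above can be illustrated with a minimal sketch in which a tabular Q-Learning agent adds a compensation force to a PD term. The abstract does not state how the two signals are combined, so the additive form, the gains, the sign convention, the state discretization, and the action set below are all assumptions made for illustration, and every function name is hypothetical.

```python
import numpy as np

# All numeric values below are illustrative assumptions, not values from the paper.
KP, KD = 40.0, 8.0                       # hypothetical PD gains
ACTIONS = np.array([-5.0, 0.0, 5.0])     # hypothetical compensation forces
ANGLE_BINS = np.linspace(-0.5, 0.5, 9)   # bin edges for the top pendulum angle (rad)
RATE_BINS = np.linspace(-2.0, 2.0, 9)    # bin edges for its angular rate (rad/s)


def make_q_table():
    """Q-table indexed by (angle bin, rate bin, action), initialized to zero."""
    return np.zeros((len(ANGLE_BINS) + 1, len(RATE_BINS) + 1, len(ACTIONS)))


def pd_control(theta, theta_dot):
    """PD term on the top pendulum angle error (reference angle is 0, upright).

    The sign convention is an assumption and depends on the model used.
    """
    return KP * (0.0 - theta) + KD * (0.0 - theta_dot)


def discretize(theta, theta_dot):
    """Map the continuous state to a pair of bin indices."""
    i = int(np.digitize(theta, ANGLE_BINS))
    j = int(np.digitize(theta_dot, RATE_BINS))
    return (i, j)


def q_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.99):
    """Standard one-step Q-Learning update, applied in place."""
    td_target = r + gamma * np.max(Q[s_next])
    Q[s + (a,)] += alpha * (td_target - Q[s + (a,)])
    return Q


def hybrid_control(Q, theta, theta_dot, epsilon=0.0, rng=None):
    """PD output plus the compensation chosen by an epsilon-greedy policy."""
    s = discretize(theta, theta_dot)
    if rng is not None and rng.random() < epsilon:
        a = int(rng.integers(len(ACTIONS)))   # explore
    else:
        a = int(np.argmax(Q[s]))              # exploit
    return pd_control(theta, theta_dot) + ACTIONS[a], s, a
```

In a simulation loop, `hybrid_control` would be called at each time step, the resulting force applied to the cart model, and `q_update` called with the observed reward and next state. A reward that penalizes the top pendulum angle and a cart displacement beyond ±1 meter would match the constraints described above, although the paper's actual reward is not given here.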
