Learning Throttle Valve Control Using Policy Search

The throttle valve is a technical device for regulating a fluid or gas flow. Throttle valve control is a challenging task due to the valve's complex dynamics and the demanding constraints imposed on the controller. With state-of-the-art throttle valve control, such as model-free PID controllers, time-consuming manual tuning of the controller is necessary. In this paper, we investigate how reinforcement learning (RL) can alleviate the effort of manual controller design by automatically learning a control policy from experience. To obtain a valid control policy for the throttle valve, several constraints need to be addressed, such as avoiding overshoot. Furthermore, the learned controller must be able to follow given desired trajectories while moving the valve from any start to any goal position; thus, multi-target policy learning needs to be considered in the RL formulation. In this study, we employ a policy search RL approach, Pilco [17], to learn a throttle valve control policy. We adapt the Pilco algorithm to take the practical requirements and constraints of the controller into account. For evaluation, we apply the resulting algorithm to several control tasks in simulation, as well as to a physical throttle valve system. The results show that policy search RL is able to learn a consistent control policy for complex, real-world systems.
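To make the two adaptations mentioned above more concrete, the sketch below illustrates, under assumed names and parameter values (not the authors' code), (i) a PILCO-style saturating cost augmented with an asymmetric term that penalizes positions beyond the target, approximating a no-overshoot requirement, and (ii) an RBF controller whose input is augmented with the target position so that a single policy can serve arbitrary start and goal positions.

```python
# Hedged sketch of a no-overshoot cost and a target-conditioned policy;
# all names, widths, and weights are illustrative assumptions.
import numpy as np

def saturating_cost(angle, target, width=0.05, overshoot_weight=5.0):
    """PILCO-style saturating cost 1 - exp(-d^2 / (2*width^2)), plus an
    assumed asymmetric penalty applied only when the valve overshoots."""
    d = angle - target
    base = 1.0 - np.exp(-d**2 / (2.0 * width**2))
    overshoot = overshoot_weight * np.maximum(d, 0.0) ** 2  # assumed penalty form
    return base + overshoot

def rbf_policy(state, target, centers, weights, lengthscale=0.2, u_max=1.0):
    """RBF-network controller on the augmented input (state, target); the
    output is squashed by tanh to respect actuation limits."""
    x = np.concatenate([state, [target]])
    sq_dist = np.sum((centers - x) ** 2, axis=1)
    phi = np.exp(-sq_dist / (2.0 * lengthscale**2))
    return u_max * np.tanh(weights @ phi)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    centers = rng.normal(size=(20, 3))   # inputs: (angle, velocity, target)
    weights = rng.normal(size=20) * 0.1  # policy parameters to be optimized
    state = np.array([0.1, 0.0])         # current valve angle and velocity
    print(saturating_cost(0.35, target=0.3))
    print(rbf_policy(state, target=0.3, centers=centers, weights=weights))
```

In a PILCO-style loop, the policy parameters (here, `weights`) would be optimized against the expected long-term cost predicted by a learned Gaussian process dynamics model; the snippet only illustrates the cost shaping and target conditioning.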

[1] I. J. Leontaritis et al. Input-output parametric models for non-linear systems Part II: stochastic non-linear systems, 1985.

[2] Wei Sun et al. Neural Network Based Self-Learning Control Strategy for Electronic Throttle Valve, 2010, IEEE Transactions on Vehicular Technology.

[3] Richard S. Sutton et al. Introduction to Reinforcement Learning, 1998.

[4] Shibly Ahmed Al-Samarraie et al. Design of Electronic Throttle Valve Position Control System using Nonlinear PID Controller, 2012.

[5] P. Mercorelli et al. Throttle valve control using an inverse local linear model tree based on a fuzzy neural network, 2008, 7th IEEE International Conference on Cybernetic Intelligent Systems.

[6] Claudio Garcia et al. Comparison of friction models applied to a control valve, 2008.

[7] Yaonan Wang et al. SVM-Based Approximate Model Control for Electronic Throttle Valve, 2008, IEEE Transactions on Vehicular Technology.

[8] Neil D. Lawrence et al. Fast Forward Selection to Speed Up Sparse Gaussian Process Regression, 2003, AISTATS.

[9] Marco Wiering et al. Reinforcement Learning, 2014, Adaptation, Learning, and Optimization.

[10] Yaonan Wang et al. Harmony search algorithm-based fuzzy-PID controller for electronic throttle valve, 2013, Neural Computing and Applications.

[11] Carl E. Rasmussen et al. Learning to Control a Low-Cost Manipulator using Data-Efficient Reinforcement Learning, 2011, Robotics: Science and Systems.

[12] D. Fox et al. Multiple-Target Reinforcement Learning with a Single Policy, 2011.

[13] Jun Nakanishi et al. A Bayesian Approach to Nonlinear Parameter Identification for Rigid Body Dynamics, 2006, Robotics: Science and Systems.

[14] Peter Vrancx et al. Reinforcement Learning: State-of-the-Art, 2012.

[15] Marc Peter Deisenroth et al. Efficient reinforcement learning using Gaussian processes, 2010.

[16] Paul G. Griffiths. Embedded Software Control Design for an Electronic Throttle Body, 2002.

[17] Carl E. Rasmussen et al. PILCO: A Model-Based and Data-Efficient Approach to Policy Search, 2011, ICML.