Using Least-Square Policy Iteration to Online Optimize the Parameters of ALV's Speed Controller

Because the longitudinal dynamics of autonomous land vehicles (ALVs) are highly nonlinear, it is difficult to tune the parameters of a speed controller for autonomous driving. To address this problem, this paper proposes a novel learning-based speed controller composed of a time-varying proportional-integral (PI) control structure and a learning module. A near-optimal policy is obtained by least-squares policy iteration (LSPI), an approximate policy iteration method. The learning module uses this near-optimal policy to tune the PI coefficients online. Simulation results show that the proposed controller can optimize the control performance by combining different non-optimal coefficients of the PI structure.
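To make the idea concrete, the following is a minimal sketch of how LSPI could choose among a few fixed PI gain sets at each control step. It is not the paper's implementation: the toy vehicle model, the candidate gains in `GAINS`, the polynomial features of the speed error, the reward (negative squared error), and all function names are assumptions introduced only for illustration.

```python
import random

# Hypothetical (Kp, Ki) candidates; the paper's actual coefficients are not given here.
GAINS = [(0.2, 0.01), (0.8, 0.05), (2.0, 0.2)]
GAMMA, DT = 0.9, 0.1  # assumed discount factor and control period

def step(v, integ, v_ref, action):
    """One control step of an assumed toy nonlinear longitudinal model."""
    kp, ki = GAINS[action]
    err = v_ref - v
    integ += err * DT
    u = max(-3.0, min(3.0, kp * err + ki * integ))  # saturated PI output
    return v + DT * (u - 0.02 * v * abs(v)), integ   # quadratic drag term

def features(err, action):
    """Block basis: [1, e, e^2] placed in the slot of the chosen action."""
    phi = [0.0] * (3 * len(GAINS))
    phi[3 * action:3 * action + 3] = [1.0, err, err * err]
    return phi

def q_value(w, err, action):
    return sum(wi * fi for wi, fi in zip(w, features(err, action)))

def greedy(w, err):
    """Greedy policy: index of the gain set with the highest estimated Q."""
    return max(range(len(GAINS)), key=lambda a: q_value(w, err, a))

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def collect(n_episodes=50, horizon=40, seed=0):
    """Gather (error, action, reward, next_error) samples with random actions."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n_episodes):
        v, integ, v_ref = rng.uniform(0, 10), 0.0, rng.uniform(0, 10)
        for _ in range(horizon):
            a = rng.randrange(len(GAINS))
            err = v_ref - v
            v, integ = step(v, integ, v_ref, a)
            next_err = v_ref - v
            samples.append((err, a, -next_err ** 2, next_err))
    return samples

def lstdq(samples, w):
    """Policy evaluation: solve A w' = b for the policy greedy in w."""
    k = 3 * len(GAINS)
    A = [[1e-3 if i == j else 0.0 for j in range(k)] for i in range(k)]  # small ridge term
    b = [0.0] * k
    for err, a, r, next_err in samples:
        phi = features(err, a)
        phi_next = features(next_err, greedy(w, next_err))
        for i in range(k):
            b[i] += phi[i] * r
            for j in range(k):
                A[i][j] += phi[i] * (phi[j] - GAMMA * phi_next[j])
    return solve(A, b)

def lspi(samples, iters=10, tol=1e-6):
    """Alternate evaluation and greedy improvement until the weights settle."""
    w = [0.0] * (3 * len(GAINS))
    for _ in range(iters):
        w_new = lstdq(samples, w)
        converged = max(abs(x - y) for x, y in zip(w_new, w)) < tol
        w = w_new
        if converged:
            break
    return w
```

Under these assumptions, `w = lspi(collect())` learns the weights offline, and calling `greedy(w, v_ref - v)` at every control step switches between the fixed gain sets, which is one way a time-varying PI structure could arise from individually non-optimal coefficients. The sketch uses only the speed error as the state; a fuller state (for example, including the integral term or acceleration) would likely be needed in practice.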
