A supervised Actor–Critic approach for adaptive cruise control

This paper proposes a novel supervised Actor–Critic (SAC) approach for the adaptive cruise control (ACC) problem. The two key elements of the SAC algorithm, the Actor and the Critic, are each approximated by a feed-forward neural network. The Actor's output, together with the state, is fed into the Critic to approximate the performance index function. A Lyapunov stability analysis is presented to prove that the estimation errors of the neural networks are uniformly ultimately bounded. Moreover, a supervisory controller is used to pre-train the Actor so that it starts from a basic control policy, which improves training convergence and the success rate. We apply this method to learn an approximately optimal control policy for the ACC problem. Experimental results in several driving scenarios demonstrate that the SAC algorithm performs well and is therefore feasible and effective for the ACC problem.
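The structure described above can be illustrated with a minimal Python sketch. It is not the paper's implementation: the two-element state (relative-speed error, distance error), the saturated linear `supervisor` law and its gains, the bias-free one-hidden-layer `FeedForwardNet`, and every hyperparameter are assumptions chosen only for illustration. The sketch pre-trains the Actor by regressing onto the supervisor's action and then shows how the state and the Actor's output are concatenated as the Critic's input. The Critic is left untrained here.

```python
import math
import random


def supervisor(state, k_v=0.5, k_d=0.3):
    """Hypothetical supervisory controller: saturated linear law on the errors."""
    dv, dd = state
    return max(-1.0, min(1.0, k_v * dv + k_d * dd))


class FeedForwardNet:
    """One-hidden-layer tanh network with a scalar output (assumed architecture)."""

    def __init__(self, n_in, n_hidden, rng, squash_output):
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.squash_output = squash_output  # tanh output keeps actions in [-1, 1]

    def forward(self, x):
        self.x = list(x)
        self.h = [math.tanh(sum(w * xi for w, xi in zip(row, self.x)))
                  for row in self.w1]
        z = sum(w * h for w, h in zip(self.w2, self.h))
        self.y = math.tanh(z) if self.squash_output else z
        return self.y

    def sgd_step(self, x, target, lr):
        """One gradient-descent step on 0.5 * (y - target)^2."""
        y = self.forward(x)
        dz = (y - target) * ((1.0 - y * y) if self.squash_output else 1.0)
        for j, hj in enumerate(self.h):
            dh = dz * self.w2[j] * (1.0 - hj * hj)  # computed before w2[j] changes
            self.w2[j] -= lr * dz * hj
            for i, xi in enumerate(self.x):
                self.w1[j][i] -= lr * dh * xi


def mean_sq_error(actor, states):
    """Mean squared gap between the Actor's action and the supervisor's action."""
    return sum((actor.forward(s) - supervisor(s)) ** 2 for s in states) / len(states)


rng = random.Random(0)
actor = FeedForwardNet(2, 8, rng, squash_output=True)    # state -> action
critic = FeedForwardNet(3, 8, rng, squash_output=False)  # (state, action) -> value

# Supervised pre-training: the Actor imitates the supervisory controller.
states = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(200)]
error_before = mean_sq_error(actor, states)
for _ in range(200):
    for s in states:
        actor.sgd_step(s, supervisor(s), lr=0.05)
error_after = mean_sq_error(actor, states)

# The Critic takes the state together with the Actor's output as its input.
state = (0.2, -0.4)
action = actor.forward(state)
value = critic.forward(list(state) + [action])
```

Comparing `error_before` with `error_after` shows how closely the pre-trained Actor tracks the supervisor. In a full Actor–Critic scheme, the Critic's output would then drive further updates of both networks, which this sketch deliberately leaves out.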
