Comparison of Deep Reinforcement Learning and Model Predictive Control for Adaptive Cruise Control

This study compares Deep Reinforcement Learning (DRL) and Model Predictive Control (MPC) for Adaptive Cruise Control (ACC) design in car-following scenarios. A first-order system serves as the Control-Oriented Model (COM) to approximate the vehicle's acceleration command dynamics. Using the control system equations and a multi-objective cost function, we train a DRL policy with Deep Deterministic Policy Gradient (DDPG) and solve the MPC problem via Interior-Point Optimization (IPO). Simulation results for the episode costs show that, when there are no modeling errors and the test inputs lie within the training data range, the DRL solution is equivalent to MPC with a sufficiently long prediction horizon; in particular, the DRL episode cost is only 5.8% higher than the benchmark obtained by optimizing the entire episode via IPO. DRL control performance degrades when the test inputs fall outside the training data range, indicating inadequate generalization. When modeling errors arise from control delays, disturbances, and/or testing with a High-Fidelity Model (HFM) of the vehicle, the DRL-trained policy outperforms MPC under large modeling errors and performs comparably to MPC when the modeling errors are small.
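To make the modeling setup concrete, the following is a minimal sketch of a first-order COM for the acceleration command dynamics in a car-following scenario, discretized with forward Euler. The time constant `TAU`, step size `DT`, and state layout (gap, relative speed, ego acceleration) are illustrative assumptions, not values taken from the paper.

```python
# Hedged sketch of a first-order Control-Oriented Model (COM):
#     a_dot = (u - a) / tau
# where u is the commanded acceleration and a is the realized one.
# TAU and DT below are assumed values for illustration only.

TAU = 0.5   # assumed actuator time constant [s]
DT = 0.1    # assumed integration step [s]

def com_step(state, u, lead_accel=0.0):
    """One Euler step of the car-following COM.

    state = (d, dv, a): gap to the lead vehicle, relative speed
            dv = v_lead - v_ego, and ego acceleration.
    u     = commanded acceleration from the controller (DRL or MPC).
    """
    d, dv, a = state
    a_next = a + DT * (u - a) / TAU       # first-order lag on the command
    dv_next = dv + DT * (lead_accel - a)  # relative-speed dynamics
    d_next = d + DT * dv                  # gap dynamics
    return (d_next, dv_next, a_next)
```

Both controllers in the study act on dynamics of this form: the DRL policy is trained on rollouts of such a model, while MPC propagates it over the prediction horizon inside the optimizer.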
