Heuristic action execution for energy efficient charge-sustaining control of connected hybrid vehicles with model-free double Q-learning

Abstract This paper investigates a model-free supervisory control methodology with double Q-learning for hybrid vehicles in charge-sustaining scenarios. It aims to continuously improve the vehicle's energy efficiency while maintaining the battery's state-of-charge in real-world driving. Two new heuristic action execution policies, a max-value-based policy and a random policy, are proposed for the double Q-learning method to reduce overestimation of the merit-function values of each action in the vehicle's power-split control. Experimental studies based on software-in-the-loop (offline learning) and hardware-in-the-loop (online learning) platforms are carried out to explore the energy-saving potential over four driving cycles derived from real-world vehicle operation. The results from 35 rounds of offline, undisturbed learning show that the heuristic action execution policies improve the learning performance of conventional double Q-learning, achieving at least 1.09% higher energy efficiency. The proposed methods achieve results similar to those obtained by dynamic programming, while remaining capable of real-time online application. Double Q-learning is also shown to be more robust to disturbances during learning, achieving at least a threefold improvement in energy efficiency compared with standard Q-learning. The random execution policy achieves 1.18% higher energy efficiency than the max-value-based policy under the same driving conditions. Significance tests show that the deciding factor in the random execution policy has little impact on learning performance. When implemented for online learning, the proposed model-free control method saves more than 4.55% energy in the predefined real-world driving conditions compared with the method using standard Q-learning.
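To make the approach concrete, the sketch below pairs a standard double Q-learning update (van Hasselt, 2010) with the two heuristic action execution policies named in the abstract. It is a minimal illustration under stated assumptions, not the paper's implementation: the state/action discretisation, the hyperparameter values, and names such as deciding_factor are invented for the example.

import numpy as np

rng = np.random.default_rng(0)
n_states, n_actions = 50, 5          # assumed discretised SoC/power-demand grid
alpha, gamma, epsilon = 0.1, 0.95, 0.1
deciding_factor = 0.5                # assumed: probability of acting on Q_a in the random policy

Q_a = np.zeros((n_states, n_actions))
Q_b = np.zeros((n_states, n_actions))

def select_action(state, policy="max"):
    """Heuristic action execution with epsilon-greedy exploration."""
    if rng.random() < epsilon:                       # explore
        return int(rng.integers(n_actions))
    if policy == "max":                              # max-value-based policy:
        merged = np.maximum(Q_a[state], Q_b[state])  # greedy on element-wise max of both tables
    else:                                            # random policy: pick one table per step
        merged = Q_a[state] if rng.random() < deciding_factor else Q_b[state]
    return int(np.argmax(merged))

def update(state, action, reward, next_state):
    """Double Q-learning update: one table selects the action, the other evaluates it."""
    if rng.random() < 0.5:            # update Q_a, evaluate with Q_b
        a_star = int(np.argmax(Q_a[next_state]))
        td = reward + gamma * Q_b[next_state, a_star] - Q_a[state, action]
        Q_a[state, action] += alpha * td
    else:                             # update Q_b, evaluate with Q_a
        b_star = int(np.argmax(Q_b[next_state]))
        td = reward + gamma * Q_a[next_state, b_star] - Q_b[state, action]
        Q_b[state, action] += alpha * td

Decoupling action selection from action evaluation across the two tables is what curbs the maximisation bias of standard Q-learning; the two execution policies above are plausible readings of how the merged tables drive the power-split action at each step.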
