Using Reinforcement Learning to Solve a Dynamic Orienteering Problem with Random Rewards Affected by the Battery Status

This paper discusses an orienteering optimization problem in which a battery-powered electric vehicle must travel from an origin depot to a destination depot while maximizing the total reward collected along its route. The vehicle must cross several consecutive regions, each containing different types of charging nodes. One charging node has to be selected in each region, and the reward for visiting a node, understood as whether the charging process is ‘satisfactory’, is a binary random variable that depends on dynamic factors such as the type of charging node, weather conditions, congestion, and battery status. To learn how to operate efficiently in this dynamic environment, a hybrid methodology combining simulation with reinforcement learning is proposed. The reinforcement learning component makes informed decisions at each stage, while the simulation component is employed to validate the learning process. The computational experiments show that the proposed methodology is capable of designing routing plans that are significantly better than non-informed decisions, thus allowing for efficient management of the vehicle’s battery under such dynamic conditions.
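
To make the idea of learning node selections from binary rewards concrete, the following is a minimal sketch and not the method used in the paper. It assumes an epsilon-greedy learner, three hypothetical node types, a battery status reduced to two levels drawn independently in each region, and an invented table of success probabilities (`SUCCESS_PROB`) that the learner never sees directly. All names and numbers are illustrative assumptions. A simulation loop is used both to train the learner and to compare it against uninformed random choices.

```python
import random

# Illustrative assumptions only: node types, battery levels, and the
# success probabilities below are invented for this sketch.
NODE_TYPES = ["slow", "fast", "ultra_fast"]
BATTERY_LEVELS = ["low", "high"]
SUCCESS_PROB = {
    ("slow", "low"): 0.8, ("slow", "high"): 0.5,
    ("fast", "low"): 0.6, ("fast", "high"): 0.7,
    ("ultra_fast", "low"): 0.3, ("ultra_fast", "high"): 0.9,
}


def simulate_charge(node, battery, rng):
    """Binary reward: 1 if the simulated charge is 'satisfactory', else 0."""
    return 1 if rng.random() < SUCCESS_PROB[(node, battery)] else 0


def best_node(values, battery):
    """Node type with the highest estimated reward for this battery level."""
    return max(NODE_TYPES, key=lambda node: values[(node, battery)])


def train(n_episodes, n_regions, epsilon, rng):
    """Epsilon-greedy estimation of the mean reward per (node, battery) pair."""
    counts = {key: 0 for key in SUCCESS_PROB}
    values = {key: 0.0 for key in SUCCESS_PROB}
    for _ in range(n_episodes):
        for _ in range(n_regions):
            # Assumption: battery status is drawn independently per region.
            battery = rng.choice(BATTERY_LEVELS)
            if rng.random() < epsilon:
                node = rng.choice(NODE_TYPES)      # explore
            else:
                node = best_node(values, battery)  # exploit
            reward = simulate_charge(node, battery, rng)
            key = (node, battery)
            counts[key] += 1
            # Incremental update of the running mean reward.
            values[key] += (reward - values[key]) / counts[key]
    return values


def evaluate(policy, n_episodes, n_regions, rng):
    """Average total reward per route when following `policy` in simulation."""
    total = 0
    for _ in range(n_episodes):
        for _ in range(n_regions):
            battery = rng.choice(BATTERY_LEVELS)
            total += simulate_charge(policy(battery), battery, rng)
    return total / n_episodes


rng = random.Random(0)
values = train(n_episodes=5000, n_regions=5, epsilon=0.1, rng=rng)


def learned_policy(battery):
    return best_node(values, battery)


def random_policy(battery):
    return rng.choice(NODE_TYPES)


learned_score = evaluate(learned_policy, 2000, 5, rng)
random_score = evaluate(random_policy, 2000, 5, rng)
print(f"learned: {learned_score:.2f}  random: {random_score:.2f}")
```

In this toy setting the learned policy should collect a noticeably higher average reward per route than random selection, which mirrors, in a much simplified form, the comparison against non-informed decisions described above. A fuller model would track how the battery level evolves along the route rather than sampling it independently.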
