Deep Reinforcement Learning for Electric Vehicle Routing Problem with Time Windows

The past decade has seen rapid market penetration of electric vehicles (EVs), and a growing number of logistics and transportation companies have started to deploy EVs for service provision. To model the operations of a commercial EV fleet, we use the EV routing problem with time windows (EVRPTW). In this research, we propose an end-to-end deep reinforcement learning framework to solve the EVRPTW. In particular, we develop an attention model that incorporates a pointer network and a graph embedding technique to parameterize a stochastic policy for solving the EVRPTW. The model is then trained using policy gradient with a rollout baseline. Our numerical studies show that the proposed model efficiently solves large EVRPTW instances that are not solvable with any existing approaches.
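To make the training idea concrete, the following is a minimal sketch of policy gradient with a rollout baseline on a toy one-step problem. It is not the authors' implementation: the attention model is replaced by a plain softmax over a few candidate actions, the names `reinforce_step` and `train` are hypothetical, and the baseline is refreshed unconditionally every 20 steps, which is an assumption made only to keep the example short.

```python
import math
import random


def softmax(logits):
    # Numerically stable softmax over a list of floats.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce_step(logits, baseline_logits, costs, rng, lr=0.1, batch=64):
    """One policy-gradient update with a greedy rollout baseline.

    logits          -- parameters of the stochastic policy being trained
    baseline_logits -- frozen copy of an earlier policy, decoded greedily
    costs           -- cost of each candidate action (lower is better)
    """
    n = len(logits)
    probs = softmax(logits)
    # Rollout baseline: cost of the greedy action of the frozen policy.
    greedy = max(range(n), key=lambda i: baseline_logits[i])
    baseline_cost = costs[greedy]
    grad = [0.0] * n
    for _ in range(batch):
        a = rng.choices(range(n), weights=probs)[0]
        advantage = costs[a] - baseline_cost
        # d log p(a) / d logit_i = 1[i == a] - p_i
        for i in range(n):
            indicator = 1.0 if i == a else 0.0
            grad[i] += advantage * (indicator - probs[i]) / batch
    # Gradient descent on the estimated expected cost.
    return [l - lr * g for l, g in zip(logits, grad)]


def train(costs, steps=200, seed=0):
    rng = random.Random(seed)
    logits = [0.0] * len(costs)
    baseline_logits = list(logits)
    for step in range(steps):
        logits = reinforce_step(logits, baseline_logits, costs, rng)
        # Assumption: refresh the baseline on a fixed schedule.
        if (step + 1) % 20 == 0:
            baseline_logits = list(logits)
    return softmax(logits)


if __name__ == "__main__":
    print(train([5.0, 1.0, 3.0]))
```

Subtracting the baseline cost leaves the gradient estimate unbiased while reducing its variance, so in this toy setting the policy shifts probability toward the cheapest action. In a routing setting, the single action would be replaced by a full route decoded step by step, and the cost by the route's total cost.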
