Deep Reinforcement Learning for Solving the Heterogeneous Capacitated Vehicle Routing Problem

Existing deep reinforcement learning (DRL)-based methods for solving the capacitated vehicle routing problem (CVRP) intrinsically assume a homogeneous vehicle fleet, in which the fleet is treated as repetitions of a single vehicle. Hence, constructing a solution with these methods amounts solely to selecting the next node (customer) to visit, without selecting a vehicle. However, vehicles in real-world scenarios are likely to be heterogeneous, with different characteristics that affect their capacity (or travel speed), rendering existing DRL methods less effective. In this article, we tackle the heterogeneous CVRP (HCVRP), where vehicles are mainly characterized by different capacities. We consider both min-max and min-sum objectives for HCVRP, which aim to minimize the longest travel time among the vehicles and the total travel time of all vehicles in the fleet, respectively. To solve these problems, we propose a DRL method based on the attention mechanism, with a vehicle selection decoder that accounts for the heterogeneous fleet constraint and a node selection decoder that accounts for route construction. The method learns to construct a solution by automatically selecting both a vehicle and a node for this vehicle at each step. Experimental results on randomly generated instances show that, with desirable generalization to various problem sizes, our method outperforms the state-of-the-art DRL method and most of the conventional heuristics, and also delivers competitive performance against the state-of-the-art heuristic method, that is, slack induction by string removals. In addition, the results of extended experiments demonstrate that our method is also able to solve CVRPLib instances with satisfactory performance.
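As an illustration of the two objectives and of the step-wise construction described above, the following is a minimal Python sketch and not the method proposed in the article. The learned vehicle selection and node selection decoders are replaced by simple hand-written rules (least elapsed time, nearest feasible customer). All names (`travel_times`, `construct_greedy`, `coords`, `speeds`, and so on) are hypothetical, and the sketch assumes that node 0 is the depot, that travel time is Euclidean distance divided by vehicle speed, and that a vehicle may return to the depot to reload.

```python
import math


def travel_times(routes, coords, speeds):
    """Travel time per vehicle: length of depot -> route -> depot, divided by speed."""
    times = []
    for route, speed in zip(routes, speeds):
        path = [0] + list(route) + [0]  # node 0 is assumed to be the depot
        length = sum(math.dist(coords[a], coords[b]) for a, b in zip(path, path[1:]))
        times.append(length / speed)
    return times


def min_max_objective(times):
    """Longest travel time among all vehicles."""
    return max(times)


def min_sum_objective(times):
    """Total travel time of all vehicles."""
    return sum(times)


def construct_greedy(coords, demands, capacities, speeds):
    """Build routes by choosing a vehicle, then a node for it, at every step.

    Both choices are hand-written stand-ins for learned decoders (an assumption
    of this sketch): the vehicle with the least elapsed time is chosen, and it
    visits the nearest customer it can still serve, or reloads at the depot.
    """
    n_vehicles = len(capacities)
    remaining = list(capacities)   # remaining capacity of each vehicle
    position = [0] * n_vehicles    # current node of each vehicle
    elapsed = [0.0] * n_vehicles   # travel time accumulated so far
    routes = [[] for _ in range(n_vehicles)]
    unvisited = set(range(1, len(coords)))

    while unvisited:
        # Vehicle selection: only vehicles able to serve some remaining customer.
        usable = [v for v in range(n_vehicles)
                  if any(demands[c] <= capacities[v] for c in unvisited)]
        if not usable:
            raise ValueError("a customer demand exceeds every vehicle capacity")
        v = min(usable, key=lambda i: elapsed[i])

        # Node selection: nearest feasible customer, else return to the depot.
        feasible = [c for c in unvisited if demands[c] <= remaining[v]]
        if feasible:
            nxt = min(feasible, key=lambda c: math.dist(coords[position[v]], coords[c]))
        else:
            nxt = 0

        elapsed[v] += math.dist(coords[position[v]], coords[nxt]) / speeds[v]
        position[v] = nxt
        routes[v].append(nxt)
        if nxt == 0:
            remaining[v] = capacities[v]  # reload at the depot
        else:
            remaining[v] -= demands[nxt]
            unvisited.discard(nxt)
    return routes


if __name__ == "__main__":
    coords = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (2.0, 0.0)]
    demands = [0, 1, 1, 1]
    routes = construct_greedy(coords, demands, capacities=[2, 1], speeds=[1.0, 1.0])
    times = travel_times(routes, coords, speeds=[1.0, 1.0])
    print(routes, min_max_objective(times), min_sum_objective(times))
```

The min-max objective balances workload across vehicles, whereas the min-sum objective rewards short total travel; the same constructed routes can be scored under either one.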
