Mobility-Aware Charging Scheduling for Shared On-Demand Electric Vehicle Fleet Using Deep Reinforcement Learning

With the emergence of the sharing economy, shared electric vehicles (EVs) are playing an increasingly important role in future mobility-on-demand traffic systems. This article considers joint charging scheduling, order dispatching, and vehicle rebalancing for a large-scale shared-EV fleet operator. To maximize the operator's welfare, we model the joint decision making as a partially observable Markov decision process (POMDP) and apply deep reinforcement learning (DRL) combined with binary linear programming (BLP) to develop a near-optimal solution. A neural network is used to evaluate the state value of EVs at different times, locations, and states of charge. Based on the state values, dynamic electricity prices, and order information, the online scheduling is modeled as a BLP problem whose binary decision variables indicate whether an EV will 1) take an order, 2) rebalance to a position, or 3) charge. We also propose a constrained rebalancing method to improve the exploration efficiency of training. Moreover, we provide a tabular method with proven convergence as a fallback option to demonstrate the near-optimal character of the proposed approach. Simulation experiments with real-world data from Haikou City verify the effectiveness of the proposed method.
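To make the online scheduling step concrete, the sketch below shows one way such a BLP could be posed. It is a minimal illustration under assumptions, not the paper's implementation: the names (`ev_ids`, `order_actions`, `utility`) and the choice of the PuLP/CBC solver are hypothetical, and the utilities stand in for values that would, in the paper's setting, combine learned state values, order fares, and electricity prices. Each EV chooses exactly one action (serve an order, rebalance, or charge), and each order is served by at most one EV.

```python
# Illustrative BLP dispatch sketch (hypothetical names; not the paper's exact model).
# Each EV takes exactly one action; each order is assigned to at most one EV.
import pulp

ev_ids = ["ev0", "ev1", "ev2"]
order_actions = ["order_A", "order_B"]          # candidate orders this epoch
other_actions = ["rebalance_north", "charge"]   # always-available actions
actions = order_actions + other_actions

# utility[ev][a]: estimated long-term value of EV taking action a
# (assumed to be precomputed from the learned state values and prices).
utility = {ev: {a: 1.0 for a in actions} for ev in ev_ids}
utility["ev0"]["order_A"] = 3.2
utility["ev1"]["charge"] = 2.5

prob = pulp.LpProblem("ev_dispatch", pulp.LpMaximize)
x = {(ev, a): pulp.LpVariable(f"x_{ev}_{a}", cat="Binary")
     for ev in ev_ids for a in actions}

# Objective: maximize the fleet operator's total estimated welfare.
prob += pulp.lpSum(utility[ev][a] * x[ev, a] for ev in ev_ids for a in actions)

# Each EV takes exactly one action.
for ev in ev_ids:
    prob += pulp.lpSum(x[ev, a] for a in actions) == 1

# Each order is served by at most one EV.
for a in order_actions:
    prob += pulp.lpSum(x[ev, a] for ev in ev_ids) <= 1

prob.solve(pulp.PULP_CBC_CMD(msg=False))
for ev in ev_ids:
    chosen = [a for a in actions if x[ev, a].value() == 1]
    print(ev, "->", chosen[0])
```

The toy instance only illustrates the constraint pattern; at fleet scale the same assignment structure would be handed to a dedicated integer-programming or matching solver at each decision epoch.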
