Dynamic Pricing and Management for Electric Autonomous Mobility on Demand Systems Using Reinforcement Learning

The proliferation of ride-sharing systems is a major driver of the advancement of autonomous and electric vehicle technologies. This paper considers the joint routing, battery charging, and pricing problem faced by a profit-maximizing transportation service provider that operates a fleet of autonomous electric vehicles. We define a dynamic system model that captures the time-dependent and stochastic features of an electric autonomous-mobility-on-demand system. Accommodating the time-varying nature of trip demands, renewable energy availability, and electricity prices, while managing the autonomous fleet optimally, requires a dynamic policy. To develop such a policy, we first formulate the evolution of the system as a Markov decision process. We argue that solving for the optimal policy with exact dynamic programming methods is intractable, and therefore apply deep reinforcement learning to develop a near-optimal control policy. Furthermore, we establish the static planning problem by considering time-invariant system parameters. We define the capacity region and determine the optimal static policy to serve as a baseline for comparison with our dynamic policy. While the static policy provides important insights on optimal pricing and fleet management, we show that it is inefficient in a real dynamic setting. Two case studies, conducted in Manhattan and San Francisco, demonstrate the efficacy of our dynamic policy in terms of network stability and profits, with queue lengths up to 200 times shorter than under the static policy.
