Paper information - Optimal Passenger-Seeking Policies on E-hailing Platforms Using Markov Decision Process and Imitation Learning

The passenger-seeking process of vacant taxi drivers generates additional vehicle miles traveled in a road network, adding congestion and pollution. This paper employs a Markov Decision Process (MDP) to model idle e-hailing drivers' optimal sequential decisions in passenger seeking. Drivers for transportation network companies (TNCs), or e-hailing platforms (e.g., Didi, Uber), behave differently from traditional taxi drivers because they do not need to actually search for passengers. Instead, they reposition themselves so that the matching platform can match them with a passenger. Accordingly, we incorporate these new features of e-hailing drivers into our MDP model. The reward function used in the MDP model is recovered using an inverse reinforcement learning technique. We then use the 3-day trajectories of 44,160 Didi drivers to train the model. To validate the model, a Monte Carlo simulation is conducted to simulate the performance of drivers following the optimal policy, which is then compared with the performance of drivers following a baseline heuristic, the local hotspot strategy. The results show that our model achieves a 17.5% improvement over the local hotspot strategy in terms of the rate of return. The proposed MDP model captures the supply-demand ratio, given that the number of drivers in this study is sufficiently large and the number of unmatched orders is therefore assumed to be negligible. To better incorporate competition among multiple drivers into the model, we have also devised and calibrated a dynamic adjustment strategy for the order matching probability.
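To make the idea of an MDP over repositioning decisions concrete, the following is a minimal sketch of value iteration on a toy three-zone network. It is not the authors' model: the zones, matching probabilities, trip rewards, and movement costs are invented for illustration, the reward is hand-specified rather than recovered by inverse reinforcement learning, and a matched trip is assumed to end the episode. The function name `solve_seeking_mdp` and all of its parameters are hypothetical.

```python
def solve_seeking_mdp(match_prob, trip_reward, move_cost, neighbors,
                      gamma=0.9, tol=1e-10, max_iter=10_000):
    """Value iteration for a toy passenger-seeking MDP (illustrative only).

    State: the zone an idle driver is in.
    Action: the zone to reposition to (choosing the current zone means waiting).
    After arriving in zone j the driver is matched with probability
    match_prob[j] and earns trip_reward[j]; otherwise the driver keeps
    seeking from zone j. A matched trip ends the episode (an assumption).
    """
    n = len(match_prob)
    values = [0.0] * n

    def arrival_values(vals):
        # Expected value of arriving idle in each zone, before movement cost.
        return [match_prob[j] * trip_reward[j]
                + (1.0 - match_prob[j]) * gamma * vals[j]
                for j in range(n)]

    for _ in range(max_iter):
        arrive = arrival_values(values)
        # Bellman optimality update: best reachable zone net of movement cost.
        new_values = [max(arrive[j] - move_cost[i][j] for j in neighbors[i])
                      for i in range(n)]
        gap = max(abs(a - b) for a, b in zip(new_values, values))
        values = new_values
        if gap < tol:
            break

    # Greedy policy with respect to the converged values.
    arrive = arrival_values(values)
    policy = [max(neighbors[i], key=lambda j: arrive[j] - move_cost[i][j])
              for i in range(n)]
    return values, policy


# Hypothetical data: three zones in a line, with demand concentrated in zone 2.
match_prob = [0.1, 0.3, 0.8]
trip_reward = [10.0, 10.0, 10.0]
move_cost = [[0.0, 1.0, 2.0],
             [1.0, 0.0, 1.0],
             [2.0, 1.0, 0.0]]
neighbors = {0: [0, 1], 1: [0, 1, 2], 2: [1, 2]}

values, policy = solve_seeking_mdp(match_prob, trip_reward, move_cost, neighbors)
print(policy)
```

With these made-up numbers the greedy policy moves the driver step by step toward zone 2 and then waits there. A local hotspot heuristic like the baseline named in the abstract could be compared against such a policy by simulating both under the same matching probabilities.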
