Paper Information - Augmenting Decisions of Taxi Drivers through Reinforcement Learning for Improving Revenues

Taxis (including cars working with car aggregation systems such as Uber, Grab, and Lyft) have become a critical component of urban transportation. While most research and applications in the context of taxis have focused on improving performance from a customer perspective, in this paper we focus on improving performance from a taxi driver's perspective. Higher revenues for taxi drivers can help bring more drivers into the system, thereby improving availability for customers in dense urban cities. Typically, when there is no customer on board, taxi drivers cruise around to find customers either directly (on the street) or indirectly (through a request from a nearby customer by phone or through an aggregation system). For such cruising taxis, we develop a Reinforcement Learning (RL) based system that learns from real trajectory logs of drivers and advises them on the locations where finding customers maximizes their revenue. There are multiple translational challenges involved in building this RL system on real data, such as annotating the activities (e.g., roaming, going to a taxi stand) observed in trajectory logs, identifying the right features for the state and action spaces, and evaluating against the real driver performance observed in the dataset. We also provide a dynamic abstraction mechanism to improve the basic learning mechanism. Finally, we provide a thorough evaluation on a real-world data set from a developed Asian city and demonstrate that an RL-based system can provide significant benefits to the drivers.
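To make the idea of learning from trajectory logs concrete, here is a minimal sketch of batch tabular Q-learning over logged transitions. It is not the paper's implementation: the state encoding `(zone, hour)`, the action (the zone a cruising driver heads to), the toy reward values, and the function names `learn_q_from_logs` and `recommend` are all assumptions chosen for illustration.

```python
from collections import defaultdict

def learn_q_from_logs(transitions, alpha=0.1, gamma=0.9, sweeps=50):
    """Batch Q-learning over logged (state, action, reward, next_state) tuples.

    Assumed encoding (hypothetical, not from the paper):
      state  = (zone, hour), action = zone to cruise to,
      reward = revenue earned on that step.
    """
    q = defaultdict(float)
    # Only actions actually observed in the logs are considered per state.
    actions = defaultdict(set)
    for s, a, _, _ in transitions:
        actions[s].add(a)

    # Replay the logged transitions several times so values propagate.
    for _ in range(sweeps):
        for s, a, r, s_next in transitions:
            best_next = max(
                (q[(s_next, a2)] for a2 in actions[s_next]), default=0.0
            )
            q[(s, a)] += alpha * (r + gamma * best_next - q[(s, a)])
    return q, actions

def recommend(q, actions, state):
    """Return the logged action with the highest learned value, or None."""
    if not actions[state]:
        return None
    return max(actions[state], key=lambda a: q[(state, a)])

# Toy logs: made-up numbers, for illustration only.
logs = [
    (("A", 8), "A", 2.0, ("A", 9)),
    (("A", 8), "B", 10.0, ("B", 9)),
    (("B", 9), "B", 4.0, ("B", 10)),
    (("A", 9), "B", 1.0, ("B", 10)),
]
q, actions = learn_q_from_logs(logs)
print(recommend(q, actions, ("A", 8)))
```

In this toy data, heading to zone `B` from zone `A` at hour 8 earns more both immediately and afterwards, so the learned values favor it. A real system would also need the activity annotation, feature selection, and abstraction steps the abstract mentions, none of which are modeled here.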
