Data-driven inverse learning of passenger preferences in urban public transits

Urban public transit planning is crucial for reducing traffic congestion and enabling green transportation. However, there is no systematic way to integrate passengers' personal preferences into the planning of public transit routes and schedules so as to achieve high occupancy rates and efficiency gains from ride-sharing. In this paper, we take a first step toward extracting passengers' planning preferences from historical public transit data. We propose a data-driven method to construct a Markov decision process model that characterizes how passengers make sequential public transit choices among bus routes, subway lines, and transfer stops/stations. Using this model, we integrate softmax policy iteration into maximum entropy inverse reinforcement learning to infer a passenger's reward function from observed trajectory data. The inferred reward function enables an urban planner to predict passengers' route-planning decisions under proposed transit plans, for example, opening a new bus route or subway line. Finally, we demonstrate the correctness and accuracy of our modeling and inference methods on a large-scale (three-month) passenger-level public transit trajectory dataset from Shenzhen, China. Our method contributes to smart transportation design and human-centric urban planning.
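To make the inference pipeline concrete, the sketch below implements the standard maximum entropy IRL loop with a soft (log-sum-exp) Bellman backup, a close relative of the softmax policy iteration named above. It is a minimal illustration under stated assumptions, not the paper's implementation: the toy transition tensor P, the linear reward parameterization r(s) = features[s] @ theta, and all variable names are introduced here for exposition. States stand in for stops/stations and actions for route or transfer choices.

```python
import numpy as np

def soft_value_iteration(P, r, gamma=0.95, n_iters=500, tol=1e-6):
    """Soft value iteration: V(s) = log sum_a exp(Q(s,a)), with
    Q(s,a) = r(s) + gamma * sum_s' P(s'|s,a) V(s').
    P: (A, S, S) transition tensor, r: (S,) state reward.
    Returns a stochastic policy pi of shape (S, A)."""
    V = np.zeros(P.shape[1])
    for _ in range(n_iters):
        Q = r[None, :] + gamma * (P @ V)          # (A, S)
        V_new = np.logaddexp.reduce(Q, axis=0)    # soft max over actions
        if np.max(np.abs(V_new - V)) < tol:
            V = V_new
            break
        V = V_new
    Q = r[None, :] + gamma * (P @ V)
    return np.exp(Q - V[None, :]).T               # pi(a|s) = exp(Q - V)

def expected_svf(P, pi, p0, T):
    """Expected state-visitation frequencies over a horizon of T steps
    under policy pi, starting from initial distribution p0."""
    d, svf = p0.copy(), p0.copy()
    for _ in range(T - 1):
        # d'(s') = sum_{s,a} d(s) pi(a|s) P(s'|s,a)
        d = np.einsum('s,sa,ast->t', d, pi, P)
        svf += d
    return svf

def maxent_irl(P, features, demos, p0, gamma=0.95, lr=0.05, epochs=200):
    """Maximum entropy IRL with a linear reward r(s) = features[s] @ theta.
    demos: list of integer arrays, each a sequence of visited states."""
    T = max(len(tau) for tau in demos)
    # Empirical feature expectations of the demonstrated trajectories.
    f_emp = np.mean([features[tau].sum(axis=0) for tau in demos], axis=0)
    theta = np.zeros(features.shape[1])
    for _ in range(epochs):
        pi = soft_value_iteration(P, features @ theta, gamma)
        svf = expected_svf(P, pi, p0, T)
        theta += lr * (f_emp - svf @ features)    # gradient ascent step
    return theta

if __name__ == "__main__":
    # Toy instance: 3 "stops", 2 choices (e.g., stay on line vs. transfer).
    rng = np.random.default_rng(0)
    S, A, F = 3, 2, 2
    P = rng.dirichlet(np.ones(S), size=(A, S))    # random row-stochastic MDP
    features = rng.random((S, F))                 # per-stop feature vectors
    demos = [np.array([0, 1, 2]), np.array([0, 2, 2])]
    theta = maxent_irl(P, features, demos, p0=np.ones(S) / S)
    print("recovered reward weights:", theta)
```

Given demonstrated trajectories (sequences of visited stops), the gradient step matches the empirical feature counts of observed trips against the expected counts under the current soft-optimal policy, which is the core of the maximum entropy formulation.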
