Second-Order Destination Inference using Semi-Supervised Self-Training for Entry-Only Passenger Data

Automated data collection in urban transportation systems produces large volumes of passenger data. However, much of this data is incomplete, which limits insight into passenger mobility. A particularly common problem is that entry-only passenger data lacks destination information. Traditional approaches for estimating passenger destinations rely on heuristics that recover only some of the missing destinations. To handle the records that remain incomplete, this paper proposes, for the first time, a second-order inference methodology that uses semi-supervised self-training to infer the missing destinations. The methodology consists of two parts: a base learner that predicts missing destinations from the statistics of a similarity-based "training set", and a selection strategy that adds newly inferred records with high prediction confidence to that training set. To further improve the inference, we modify the base learner to incorporate personal history priors. We evaluate our designs on two data sources: a traffic and passenger behavior simulation of the city of Porto, Portugal, inspired by real data, and real bus Automated Fare Collection (AFC) data collected in the same city. The experimental results show that, compared to baseline methods that do not use self-training, our approach significantly improves inference performance and achieves notably high accuracy.
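
The self-training loop described above can be illustrated with a minimal Python sketch. It is not the paper's implementation: the similarity measure (same boarding stop, with double weight for the same time slot), the confidence score (weighted vote share), the threshold value, and all function and variable names are assumptions chosen for illustration, and personal history priors are not modeled.

```python
from collections import Counter


def predict(origin, slot, training_set):
    """Hypothetical base learner: weighted vote over similar training trips.

    A training trip counts as similar if it shares the boarding stop; it gets
    double weight if it also shares the time slot (an assumed similarity rule).
    Returns (predicted destination, confidence), or (None, 0.0) if there are
    no similar trips.
    """
    votes = Counter()
    for t_origin, t_slot, t_dest in training_set:
        if t_origin == origin:
            votes[t_dest] += 2.0 if t_slot == slot else 1.0
    if not votes:
        return None, 0.0
    dest, weight = votes.most_common(1)[0]
    return dest, weight / sum(votes.values())


def self_train(labeled, unlabeled, threshold=0.6, max_rounds=10):
    """Iteratively move high-confidence predictions into the training set.

    labeled:   list of (origin, slot, destination) with known destinations
    unlabeled: list of (trip_id, origin, slot) with missing destinations
    Returns (dict trip_id -> inferred destination, list of unresolved trip_ids).
    """
    training_set = list(labeled)
    pending = list(unlabeled)
    inferred = {}
    for _ in range(max_rounds):
        accepted, still_pending = [], []
        # Predict every pending trip against the current training set.
        for trip_id, origin, slot in pending:
            dest, conf = predict(origin, slot, training_set)
            if dest is not None and conf >= threshold:
                accepted.append((trip_id, origin, slot, dest))
            else:
                still_pending.append((trip_id, origin, slot))
        if not accepted:
            break  # nothing new was confident enough; stop early
        # Selection step: confident predictions become pseudo-labeled data.
        for trip_id, origin, slot, dest in accepted:
            inferred[trip_id] = dest
            training_set.append((origin, slot, dest))
        pending = still_pending
    return inferred, [trip_id for trip_id, _, _ in pending]


if __name__ == "__main__":
    labeled = [("A", "am", "X"), ("A", "pm", "Y")]
    unlabeled = [("u1", "A", "am"), ("u2", "A", "noon"), ("u3", "B", "am")]
    print(self_train(labeled, unlabeled))
```

In this toy run, trip `u2` is too ambiguous to label in the first round and only becomes confident after `u1` has been added to the training set, which is the effect self-training is meant to provide. Trip `u3` has no similar trips and stays unresolved.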
