Hunting or waiting? Discovering passenger-finding strategies from a large-scale real-world taxi dataset

In modern cities, a growing number of vehicles, such as taxis, are equipped with GPS devices for localization and navigation. Gathering and analyzing these large-scale, real-world digital traces provides an unprecedented opportunity to understand city dynamics and reveal hidden social and economic “realities”. One innovative pervasive application is to recommend appropriate driving strategies to taxi drivers according to time and location. In this paper, we aim to discover both efficient and inefficient passenger-finding strategies from a large-scale taxi GPS dataset collected from 5350 taxis over one year in a large city in China. We represent each passenger-finding strategy as a Time-Location-Strategy feature triplet and construct a train/test dataset containing the features of both top- and ordinary-performance taxis. We then apply a powerful feature selection tool, L1-Norm SVM, to select the most salient feature patterns that determine taxi performance. The selected patterns explain the empirical results derived from raw data analysis well and even reveal interesting hidden “facts”. Moreover, the taxi performance predictor built on the selected features achieves a prediction accuracy of 85.3% on a new test dataset and outperforms the predictor built on all features, which suggests that the selected features are indeed the right indicators of passenger-finding strategies.
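To illustrate how an L1-norm penalty can act as a feature selector, the following is a minimal sketch, not the authors' implementation. It trains a linear SVM with hinge loss and an L1 penalty using plain proximal subgradient descent, which stands in here for whatever solver the paper actually uses. The pattern names, the +1/-1 feature encoding, the toy labelling rule, and all hyperparameters are assumptions made for illustration only.

```python
from itertools import product

# Hypothetical Time-Location-Strategy patterns; the names are placeholders,
# not patterns reported in the paper.
PATTERNS = [
    ("slot_A", "zone_1", "hunting"),
    ("slot_A", "zone_1", "waiting"),
    ("slot_B", "zone_2", "hunting"),
    ("slot_B", "zone_2", "waiting"),
]


def soft_threshold(value, threshold):
    """Proximal operator of the L1 norm: shrink value toward zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def train_l1_svm(samples, labels, l1_penalty=0.05, step=0.1, epochs=200):
    """Minimise mean hinge loss + l1_penalty * ||w||_1 (bias is unpenalised)."""
    n = len(samples)
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(samples, labels):
            margin = y * (sum(w * xi for w, xi in zip(weights, x)) + bias)
            if margin < 1.0:  # only margin violators contribute a subgradient
                for j, xi in enumerate(x):
                    grad_w[j] -= y * xi / n
                grad_b -= y / n
        # Gradient step on the hinge loss, then shrinkage for the L1 term.
        weights = [
            soft_threshold(w - step * g, step * l1_penalty)
            for w, g in zip(weights, grad_w)
        ]
        bias -= step * grad_b
    return weights, bias


# Toy data (assumption): one row per taxi, one column per pattern, with +1 if
# the taxi uses the pattern more than average and -1 otherwise. All 16
# combinations are enumerated so the data is balanced.
samples = [list(row) for row in product([-1, 1], repeat=len(PATTERNS))]
# Toy rule (assumption): the first pattern alone separates top-performance
# taxis (+1) from ordinary ones (-1).
labels = [row[0] for row in samples]

weights, bias = train_l1_svm(samples, labels)

# Features whose weight was driven exactly to zero are treated as unselected.
selected = [PATTERNS[j] for j, w in enumerate(weights) if w != 0.0]
predictions = [
    1 if sum(w * xi for w, xi in zip(weights, x)) + bias >= 0 else -1
    for x in samples
]
print("selected patterns:", selected)
```

On this toy data only the first pattern keeps a non-zero weight, which mirrors the idea in the abstract: the sign and size of the surviving weights indicate which patterns are associated with top or ordinary performance, and a predictor can then be built on those patterns alone. Real data would need a held-out test set and a tuned penalty strength.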
