Exploring regularity and structure in travel behavior using Smart Card data

As the economic opportunities fostered by large cities become more diverse, the travel patterns of public transport users become more heterogeneous. Understanding these heterogeneous travel patterns is useful for a number of applications relevant to public transport agencies, from personalized customer information to improved travel demand models. This thesis explores how smart card data can be used to analyze and compare the structure of individual travel patterns observed over several weeks. Specifically, the journey sequence of each user is used to evaluate how multiple journeys and activities are ordered and combined into repeated patterns, both by the same individual over time and across individuals. The research is structured around three objectives. First, we introduce a representation of individual travel patterns and develop a measure of travel sequence regularity. The mobility of each individual is modeled as a stochastic process with memory, in which each new realization represents an activity or journey. Entropy rate, a measure of randomness in the stochastic process, is used to quantify repetition in the order of journeys and activities. This analysis reveals that the order of events is an important component of regularity that is not explicitly captured in previous literature. Second, we develop an approach to identify clusters of travel patterns with similar structure, considered with respect to public transport usage and activity patterns. Finally, we present an exploratory evaluation of the associations between the identified clusters and socio-demographic characteristics by linking smart card data to an annual travel diary survey. These three objectives are considered in the context of a practical application using the transactions of a sample of approximately 100,000 users collected between February 10 and March 1
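To illustrate why the order of events matters for regularity, the following is a minimal sketch, not the estimator used in the thesis. It assumes a simple plug-in estimate of conditional entropy with a fixed context length as a rough stand-in for the entropy rate; the function name `conditional_entropy`, the `order` parameter, and the example event labels are all hypothetical.

```python
from collections import Counter
from math import log2


def conditional_entropy(sequence, order=1):
    """Plug-in estimate, in bits, of H(X_t | previous `order` events).

    With order=0 this is the ordinary (order-free) entropy of the event
    frequencies; with order>=1 it accounts for the sequence of events.
    This is an illustrative approximation, not the thesis's estimator.
    """
    if len(sequence) <= order:
        raise ValueError("sequence must be longer than the context order")

    context_counts = Counter()  # how often each context occurs
    pair_counts = Counter()     # how often each (context, next event) occurs
    for i in range(order, len(sequence)):
        context = tuple(sequence[i - order:i])
        context_counts[context] += 1
        pair_counts[(context, sequence[i])] += 1

    total = len(sequence) - order
    h = 0.0
    for (context, _event), n in pair_counts.items():
        p_joint = n / total                   # P(context, event)
        p_cond = n / context_counts[context]  # P(event | context)
        h -= p_joint * log2(p_cond)
    return h


# Hypothetical journey/activity sequence for one user.
regular = ["home", "work"] * 10

frequency_only = conditional_entropy(regular, order=0)
order_aware = conditional_entropy(regular, order=1)
print(frequency_only, order_aware)
```

In this toy sequence the two events are equally frequent, so a frequency-only measure treats the user as maximally unpredictable over two events, while the order-aware measure sees that each event fully determines the next. Real sequences are short and noisy, so a plug-in estimate like this one would be biased in practice.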
