Inferring patterns in the multi-week activity sequences of public transport users

Abstract: The public transport networks of dense cities such as London serve passengers with widely different travel patterns. In line with the diverse lives of urban dwellers, activities and journeys are combined within and across days in diverse sequences. From personalized customer information to improved travel demand models, understanding this heterogeneity among transit users is relevant to a number of applications core to public transport agencies’ function. In this study, passenger heterogeneity is investigated based on a longitudinal representation of each user’s multi-week activity sequence derived from smart card data. We propose a methodology that leverages this representation to identify clusters of users with similar activity sequence structure. The methodology is applied to a large sample (n = 33,026) from London’s public transport network, in which each passenger is represented by a continuous 4-week activity sequence. The application reveals 11 clusters, each characterized by a distinct sequence structure. Socio-demographic information available for a small sample of users (n = 1973) is combined with smart card transactions to analyze associations between the identified patterns and demographic attributes, including passenger age, occupation, household composition and income, and vehicle ownership. The analysis reveals significant associations between users’ demographic attributes and activity patterns identified exclusively from fare transactions.
