You are how you travel: A multi-task learning framework for Geodemographic inference using transit smart card data

Abstract: Geodemographics, which describe the characteristics of populations by geographic area, are of immense importance in urban studies, public policy-making, social research and business, among other fields. Such data are difficult to collect from the public, however; collection is usually done via census, with a low update frequency. In urban areas, with the increasing prevalence of public transit equipped with automated fare payment systems, researchers can collect massive transit smart card (SC) data from a large population. SC data record daily human activities at the individual level with high spatial and temporal resolution. They can reveal passengers' frequent activity areas (e.g., residential areas) and travel behaviours, which are closely intertwined with personal interests and characteristics. This provides new opportunities for geodemographic study. This paper develops a framework to infer travellers' demographics (such as age, income level and car ownership) and their residential areas for geodemographic mapping, using SC data together with a household survey. We first use a decision tree diagram to detect passengers' residential areas. We then represent each individual's spatio-temporal activity pattern, derived from multi-week SC data, as a 2D image. Leveraging this representation, a multi-task convolutional neural network (CNN) is employed to predict multiple demographic attributes of individuals from the images. Combining the predicted demographics with the residential locations yields geodemographic information. The methodology is applied to a large-scale SC dataset provided by Transport for London. The results provide new insights into the relationship between human activity patterns and demographics. To the best of our knowledge, this is the first attempt to infer geodemographics using SC data.
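
The abstract does not specify how the 2D image is constructed, so the following is only an illustrative sketch of one plausible encoding: rows are days across a multi-week window, columns are hours of the day, and each cell counts the card taps in that day and hour. The function name `activity_image`, the day-by-hour layout and the sample tap times are all assumptions made for illustration, not the authors' method.

```python
from datetime import datetime


def activity_image(tap_times, n_weeks, start):
    """Build a (7 * n_weeks) x 24 grid of tap counts.

    Assumed layout (not taken from the paper): row = day offset from
    `start`, column = hour of day, cell = number of taps in that slot.
    """
    n_days = 7 * n_weeks
    image = [[0] * 24 for _ in range(n_days)]
    for t in tap_times:
        day = (t.date() - start.date()).days
        # Ignore taps that fall outside the observation window.
        if 0 <= day < n_days:
            image[day][t.hour] += 1
    return image


# Hypothetical tap records for a single card over a two-week window.
start = datetime(2020, 1, 6)
taps = [
    datetime(2020, 1, 6, 8, 15),
    datetime(2020, 1, 6, 17, 40),
    datetime(2020, 1, 7, 8, 5),
    datetime(2020, 1, 20, 9, 0),  # outside the two-week window, dropped
]
img = activity_image(taps, n_weeks=2, start=start)
```

A grid like this could be passed to a CNN as a single-channel image. In a multi-task setup, a shared convolutional trunk would typically feed one output head per demographic attribute.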
