Travelers or locals? Identifying meaningful sub-populations from human movement data in the absence of ground truth

As users of mobile devices make phone calls, browse the web, or use apps, they routinely generate large volumes of data that are a potentially useful source for investigating human behavior in space. However, because such data are usually collected only as a by-product, they often lack stringent experimental design and ground truth, which makes it challenging to interpret them and to derive valid behavioral conclusions. Here, we propose an unsupervised, data-driven approach to identify different user types from high-resolution human movement data collected by a smartphone navigation app, in the absence of ground truth. We capture the spatio-temporal footprints of users, characterized by meaningful summary statistics, which are then used in an unsupervised step to identify user types. Based on an extensive dataset of users of the mobile navigation app Sygic in Australia, we show how the proposed methodology makes it possible to identify two distinct groups of users: ‘travelers’, who visit different areas with distinct, salient characteristics, and ‘locals’, who cover shorter distances and revisit many of their locations. We verify our approach by relating user types to space use: we find that travelers and locals prefer to visit different locations in the Australian cities of Sydney and Melbourne, as suggested independently by other studies. Although we use high-resolution GPS data, the proposed methodology is potentially transferable to low-resolution movement data (e.g. Call Detail Records), since it relies only on summary statistics.
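
The two-stage idea described above (summarize each user's footprint, then group users without labels) can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration and is not taken from the paper: the two summary statistics (total distance travelled and a grid-based revisit ratio), the 0.01-degree grid cell, the use of k-means with two clusters, the synthetic coordinates, and all function names.

```python
import math
from collections import Counter


def haversine_km(p, q):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*p, *q))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(a))


def footprint(points, cell=0.01):
    """Summarise one user's fixes as [total distance in km, revisit ratio].

    Both statistics and the grid cell size are illustrative assumptions.
    """
    dist = sum(haversine_km(a, b) for a, b in zip(points, points[1:]))
    cells = Counter((round(lat / cell), round(lon / cell)) for lat, lon in points)
    revisit = 1.0 - len(cells) / len(points)  # share of fixes in already-seen cells
    return [dist, revisit]


def standardise(rows):
    """Scale every feature column to zero mean and unit variance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    sds = [math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
           for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(r, means, sds)] for r in rows]


def two_means(rows, iters=50):
    """Plain Lloyd's k-means with k=2 and a deterministic farthest-point start."""
    centres = [rows[0], max(rows, key=lambda r: math.dist(r, rows[0]))]
    labels = []
    for _ in range(iters):
        # Assign each user to the nearest centre, then recompute the centres.
        labels = [min((0, 1), key=lambda k: math.dist(r, centres[k])) for r in rows]
        for k in (0, 1):
            members = [r for r, lab in zip(rows, labels) if lab == k]
            if members:
                centres[k] = [sum(c) / len(members) for c in zip(*members)]
    return labels


# Synthetic (lat, lon) traces, invented for this example only.
users = {
    "u1": [(-33.87, 151.21), (-33.88, 151.20), (-33.87, 151.21),
           (-33.88, 151.20), (-33.87, 151.21)],
    "u2": [(-37.81, 144.96), (-37.82, 144.97), (-37.81, 144.96),
           (-37.82, 144.97), (-37.81, 144.96)],
    "u3": [(-33.87, 151.21), (-35.28, 149.13), (-37.81, 144.96),
           (-34.93, 138.60), (-31.95, 115.86)],
    "u4": [(-27.47, 153.03), (-33.87, 151.21), (-37.81, 144.96),
           (-42.88, 147.33), (-34.93, 138.60)],
}

features = [footprint(trace) for trace in users.values()]
labels = two_means(standardise(features))
print(dict(zip(users, labels)))
```

The cluster indices carry no meaning by themselves; which group corresponds to ‘travelers’ or ‘locals’ has to be read off the cluster summaries, here long distance with few revisits versus short distance with many revisits. Because only per-user summaries enter the clustering, the same sketch would in principle accept coarser location data.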
