A Statistical Approach for Studying the Spatio-Temporal Distribution of Geolocated Tweets in Urban Environments

A detailed description of urban population dynamics is a fundamental first step towards effective planning and design processes in cities. Understanding the behavioral aspects of human activities can contribute to their effective management and control. We present a framework, based on statistical methods, for studying the spatio-temporal distribution of geolocated tweets as a proxy for where and when people carry out their activities. We evaluate the proposal by analyzing geolocated tweets collected over a two-week period in the summer of 2017 in Lisbon, London, and Manhattan. As a first step, we fit a negative binomial regression to the time series of tweet counts. We then apply functional principal component analysis to second-order summary statistics of the hourly spatial point patterns formed by the tweet locations. Finally, we apply hierarchical clustering to the principal component scores to find groups of hours with a similar spatial arrangement of the places where people carry out their activities. Social media events show strong temporal patterns, such as seasonal variation by hour of the day and day of the week, together with autoregressive structure. We also identify spatio-temporal clustering patterns, i.e., groups of hours of the day that present a similar spatial distribution of human activities.
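The spatial part of the framework (second-order summary statistics per hour, principal components, hierarchical clustering) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the data are synthetic points on the unit square, the summary statistic is a naive Ripley's K estimate without edge correction, functional PCA is approximated by ordinary PCA on the curves evaluated on a grid of radii, and Ward linkage is used for the clustering. The names `ripley_k`, `pca_scores`, and `simulate_hour` are hypothetical helpers. The negative binomial regression step is not covered.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import pdist


def ripley_k(points, radii, area):
    """Naive Ripley's K estimate (no edge correction) for an (n, 2) array."""
    n = len(points)
    dists = pdist(points)  # each unordered pair appears once
    # Ordered pairs within distance r = 2 * unordered pairs within r.
    pair_counts = np.array([2.0 * np.sum(dists <= r) for r in radii])
    return area * pair_counts / (n * (n - 1))


def pca_scores(curves, n_components):
    """PCA scores of discretized curves (a grid-based stand-in for FPCA)."""
    centered = curves - curves.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :n_components] * s[:n_components]


def simulate_hour(rng, clustered, n=200):
    """Synthetic hourly pattern: uniform, or concentrated around hotspots."""
    if not clustered:
        return rng.uniform(0.0, 1.0, size=(n, 2))
    # Assumed fixed hotspots shared by all 'clustered' hours.
    hotspots = np.array([[0.3, 0.3], [0.7, 0.6], [0.5, 0.8]])
    chosen = hotspots[rng.integers(0, len(hotspots), size=n)]
    return chosen + rng.normal(0.0, 0.03, size=(n, 2))


rng = np.random.default_rng(0)
# Assumption: hours 0-11 are spatially uniform, hours 12-23 are clustered.
is_clustered = np.array([hour >= 12 for hour in range(24)])
radii = np.linspace(0.01, 0.2, 20)

# One K curve per hour, evaluated on the common grid of radii.
curves = np.array(
    [ripley_k(simulate_hour(rng, c), radii, area=1.0) for c in is_clustered]
)

# Principal component scores, then hierarchical clustering of the hours.
scores = pca_scores(curves, n_components=2)
labels = fcluster(linkage(scores, method="ward"), t=2, criterion="maxclust")
print(labels)
```

On real data, each hourly pattern would be the set of tweet locations observed in that hour, and an edge-corrected estimator together with a basis-expansion form of functional PCA would be more faithful to the approach the abstract describes.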
