Human Activity Analytics Based on Mobility and Social Media Data

The development of social networks such as Twitter, Facebook and Google+ allows users to share their beliefs, feelings, or observations with their circles of friends. Based on these data, a range of applications and techniques has been developed, aiming to provide a better quality of life to users. Nevertheless, the quality of the results of geolocation-aware applications is significantly restricted by the tiny percentage of social media data that is geotagged (2% for Twitter). Hence, increasing this percentage is an important and challenging problem. Moreover, information extracted from social media data can be complemented by the analysis of mobile phone usage data, in order to provide further insights into human activity patterns. In this thesis, we present a novel method for analyzing and geolocalizing non-geotagged Twitter posts. The proposed method is the first to do so at the fine grain of city neighborhoods, while being both effective and time efficient. Our method is based on the extraction of representative keywords for each candidate location, as well as the analysis of the tweet volume time series. We also describe a system built on top of our method, which geolocalizes tweets and allows users to visually examine the results and their evolution over time. Our system allows the user to get a better idea of how the activity of a particular location changes and which keywords are the most important, as well as to geolocalize individual tweets of interest. Moreover, we study the activity and mobility characteristics of the users who post geotagged tweets, and compare the mobility of users who attended the event with that of a random set of users. Interestingly, the results of this analysis indicate that a very small number of users (fewer than 35 in this study) is able to represent the mobility patterns present in the entire dataset.
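The keyword-based step described above can be illustrated with a minimal sketch. It is not the thesis's actual method: it assumes a plain TF-IDF weighting across candidate locations, ignores the tweet volume time series, and uses hypothetical function names (`build_keyword_weights`, `geolocalize`) and toy data.

```python
import math
from collections import Counter


def build_keyword_weights(tweets_by_location):
    """Return {location: {term: weight}} learned from geotagged tweets.

    Assumption: terms are weighted by TF-IDF, where each candidate
    location plays the role of one document.
    """
    term_counts = {
        loc: Counter(word for tweet in tweets for word in tweet.lower().split())
        for loc, tweets in tweets_by_location.items()
    }
    n_locations = len(term_counts)
    # Number of locations in which each term appears at least once.
    doc_freq = Counter(term for counts in term_counts.values() for term in counts)
    weights = {}
    for loc, counts in term_counts.items():
        total = sum(counts.values())
        weights[loc] = {
            term: (count / total) * math.log(n_locations / doc_freq[term])
            for term, count in counts.items()
        }
    return weights


def geolocalize(tweet, weights):
    """Return the best-matching location, or None if no keyword matches."""
    tokens = tweet.lower().split()
    scores = {
        loc: sum(term_weights.get(tok, 0.0) for tok in tokens)
        for loc, term_weights in weights.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


# Toy geotagged tweets for two hypothetical neighborhoods.
training = {
    "harbor": ["ferry at the pier", "pier sunset ferry"],
    "stadium": ["match at the stadium", "goal stadium crowd"],
}
weights = build_keyword_weights(training)
prediction = geolocalize("waiting for the ferry", weights)
```

Terms shared by every location (such as "the" in the toy data) receive a weight of zero, so only location-specific keywords influence the prediction. A tweet with no such keyword is left unassigned rather than forced onto a location.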
Finally, we study call activity and mobility patterns, clustering the observed behaviors that exhibit similar characteristics and characterizing the anomalous ones. We analyze a Call Detail Record (CDR) dataset containing aggregated information on the calls among mobile phones. Employing density-based algorithms and statistical analysis, we develop a framework that identifies abnormal locations, as well as abnormal time intervals. The results of this work can be used for the early identification of exceptional situations, for monitoring the effects of important events in urban and transportation planning, and for other purposes.
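As a rough illustration of how a density-based algorithm can flag abnormal locations, the sketch below implements a minimal DBSCAN and treats noise points as anomalies. The text does not name the specific algorithm or features used, so DBSCAN, the two-dimensional feature vectors, the cell names, and the `eps` and `min_pts` values are all assumptions made for the example.

```python
import math


def dbscan(points, eps, min_pts):
    """Minimal DBSCAN: return one label per point, with -1 meaning noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)

    def neighbors(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = 0
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later become a border point
            continue
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            j_neighbors = neighbors(j)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is a core point, keep expanding
        cluster += 1
    return labels


# Hypothetical per-cell features, e.g. normalized call volume and call duration.
cells = {
    "A": (1.0, 1.1),
    "B": (1.1, 1.0),
    "C": (0.9, 1.0),
    "D": (1.0, 0.9),
    "E": (5.0, 4.8),
}
labels = dbscan(list(cells.values()), eps=0.5, min_pts=3)
abnormal = [name for name, label in zip(cells, labels) if label == -1]
```

In this toy data, cells A to D form one dense cluster, while cell E has no close neighbors and is therefore reported as abnormal. The same idea could be applied to time intervals by describing each interval with a feature vector instead of each location.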
