Assessing spatiotemporal predictability of LBSN: a case study of three Foursquare datasets

Location-based social networks (LBSN) have provided new possibilities for researchers to gain knowledge about human spatiotemporal behavior, and to make predictions about how people might behave through space and time in the future. An important requirement for successfully using LBSN in these ways is a thorough understanding of the respective datasets, including their inherent potential as well as their limitations. Specifically, when it comes to predictions, we must know what we can actually expect from the data, and how we could maximize their usefulness. Yet, this knowledge is still largely lacking from the literature. Hence, this work explores one particular aspect: the theoretical predictability of LBSN datasets. The uncovered predictability is represented as an interval. The lower bound of the interval corresponds to the amount of regular behavior that can easily be anticipated, and represents the correct prediction rate that any algorithm should be able to achieve. The upper bound corresponds to the amount of information contained in the dataset, and represents the maximum correct prediction rate that no algorithm can exceed. Three Foursquare datasets from three American cities are studied as an example. It is found that, within the investigated datasets, the lower bound of predictability of human spatiotemporal behavior is 27%, and the upper bound is 92%. Hence, the inherent potential of the datasets for predicting human spatiotemporal behavior is clarified, and the revealed interval allows a realistic assessment of the quality of predictions and thus of the associated algorithms. Additionally, in order to provide further insight into the practical use of the datasets, the relationship between the predictability and the check-in frequency is investigated from three different perspectives.
It is found that the individual perspective provides no significant correlation between the predictability and the check-in frequency. In contrast, the same two quantities are found to be negatively correlated from the temporal and spatial perspectives. Our study further indicates that heavily frequented contexts and some extraordinary geographic features, such as airports, could be good starting points for effective improvements of prediction algorithms. In general, this research provides novel knowledge regarding the nature of LBSN datasets and practical insights for a more reasonable utilization of such datasets.
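The abstract does not spell out how the two bounds are computed. As a rough illustration only, the sketch below assumes one common approach from the mobility-predictability literature: the lower bound is taken as the share of check-ins at a user's most visited venue, and the upper bound is obtained by numerically solving Fano's inequality for a given entropy. The function names, the toy check-in sequence, and the use of plain Shannon entropy (which ignores the temporal order of check-ins) are all assumptions made for this example, not details taken from the study.

```python
import math
from collections import Counter


def shannon_entropy(visits):
    """Shannon entropy (bits) of the venue distribution; ignores visit order."""
    counts = Counter(visits)
    total = len(visits)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def lower_bound(visits):
    """Share of check-ins at the most visited venue (assumed 'regularity' baseline)."""
    counts = Counter(visits)
    return max(counts.values()) / len(visits)


def upper_bound(entropy, n_locations, tol=1e-9):
    """Solve Fano's inequality  S = H(p) + (1 - p) * log2(N - 1)  for p by bisection."""
    if n_locations < 2:
        return 1.0  # a single venue is trivially predictable

    def fano(p):
        # Binary entropy of p, with the usual convention 0 * log2(0) = 0.
        h = 0.0 if p in (0.0, 1.0) else -p * math.log2(p) - (1 - p) * math.log2(1 - p)
        return h + (1 - p) * math.log2(n_locations - 1)

    # fano() decreases from log2(N) at p = 1/N down to 0 at p = 1.
    lo, hi = 1.0 / n_locations, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if fano(mid) > entropy:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Hypothetical check-in sequence of one user (venue labels are made up).
visits = ["home", "home", "work", "home", "gym", "work", "home", "home"]
s = shannon_entropy(visits)
low = lower_bound(visits)
high = upper_bound(s, len(set(visits)))
print(f"entropy={s:.3f} bits, predictability interval=[{low:.3f}, {high:.3f}]")
```

An entropy estimator that accounts for the order of check-ins, such as a Lempel-Ziv based estimate, would generally yield a lower entropy and therefore a higher upper bound than this order-free simplification.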
