Social Media data: Challenges, opportunities and limitations in urban studies

Abstract: Analysing the city through data retrieved from Location Based Social Networks (LBSNs) has received considerable attention as a promising method for applied research. However, the use of these data is not without its challenges and has given rise to a stream of polemical arguments over the validity of this source of information. This paper addresses the challenges and opportunities, as well as some of the limitations and biases, associated with the collection and use of LBSN data from Foursquare, Twitter, Google Places, Instagram and Airbnb in the context of urban phenomena research. The most recent research that uses LBSN data to understand city dynamics is presented. A method is proposed for LBSN data retrieval, selection, classification and analysis. In addition, key thematic research lines are identified based on the data variables offered by these LBSNs. The main contribution of this study is a comprehensive and descriptive framework for the study of urban phenomena through LBSN data.
