A graph-based semi-supervised approach to classification learning in digital geographies

Abstract: As the distinction between online and physical spaces rapidly erodes, social media have become integral to how many people's everyday experiences are mediated. Interest has therefore grown in how the content shared through these platforms contributes to the collaborative creation of places in physical space at the urban scale. Exploring digital geographies of social media data through methods such as qualitative coding (i.e., content labelling) is flexible but complex, and it is commonly limited to small samples because it is impractical over large datasets. In this paper, we propose a new tool for studies in digital geographies that bridges qualitative and quantitative approaches: it learns a set of arbitrary labels (qualitative codes) from a small, manually created sample and applies the same labels to a larger set. We introduce a semi-supervised deep neural network approach to classify geo-located social media posts based on their textual and image content as well as their geographical and temporal aspects. Our approach is rooted in an understanding of social media posts as augmentations of the time-space configurations that constitute places. It comprises a stacked multi-modal autoencoder neural network, which creates joint representations of text and images, and a spatio-temporal graph convolutional neural network for semi-supervised classification. The results presented in this paper show that our approach classifies social media content with higher accuracy than traditional machine learning models and two state-of-the-art deep learning frameworks.
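To make the classification stage more concrete, the following is a minimal sketch of semi-supervised node classification with a two-layer graph convolution over a spatio-temporal graph of posts. It is an illustration under stated assumptions, not the implementation evaluated in the paper: the linking rule (distance and time thresholds), the function names, the toy coordinates, and the random weights are all hypothetical, and the random feature matrix merely stands in for the joint text-image representations that the multi-modal autoencoder would produce. The propagation rule uses the widely used symmetric normalisation of the adjacency matrix with self-loops.

```python
import numpy as np

def spatiotemporal_adjacency(xy, t, max_dist, max_dt):
    """Link two posts when they are close in both space and time.

    The thresholding rule is an assumption made for this sketch.
    """
    dist = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=-1)
    dt = np.abs(t[:, None] - t[None, :])
    adj = ((dist <= max_dist) & (dt <= max_dt)).astype(float)
    np.fill_diagonal(adj, 0.0)  # self-loops are added during normalisation
    return adj

def normalize_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2}."""
    a_hat = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn_forward(a_norm, x, w0, w1):
    """Two graph-convolution layers followed by a row-wise softmax."""
    hidden = np.maximum(a_norm @ x @ w0, 0.0)  # ReLU
    logits = a_norm @ hidden @ w1
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    exp = np.exp(logits)
    return exp / exp.sum(axis=1, keepdims=True)

def masked_cross_entropy(probs, labels, mask):
    """Cross-entropy computed on the manually labelled posts only."""
    picked = probs[np.arange(len(labels)), labels]
    return -np.mean(np.log(picked[mask] + 1e-12))

# Toy data: four posts with planar coordinates and timestamps (hypothetical units).
rng = np.random.default_rng(0)
xy = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
t = np.array([0.0, 1.0, 0.0, 50.0])
adj = spatiotemporal_adjacency(xy, t, max_dist=1.0, max_dt=10.0)
a_norm = normalize_adjacency(adj)

# Placeholder for the joint text-image embeddings of each post.
x = rng.normal(size=(4, 8))
w0 = rng.normal(scale=0.1, size=(8, 16))
w1 = rng.normal(scale=0.1, size=(16, 3))

labels = np.array([0, 0, 2, 1])
labelled = np.array([True, False, True, False])  # only two posts were coded by hand

probs = gcn_forward(a_norm, x, w0, w1)
loss = masked_cross_entropy(probs, labels, labelled)
```

In this sketch only the labelled posts contribute to the loss, while every post, labelled or not, takes part in the neighbourhood averaging. That is what allows a small manually coded sample to inform predictions for the larger unlabelled set. In practice the weights would be fitted by gradient descent on this loss.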
