Image Analysis Enhanced Event Detection from Geo-Tagged Tweet Streams

Events detected from social media streams often include early signs of accidents, crimes, or disasters, so the relevant parties can use them to respond in a timely and efficient manner. Although event detection from tweet streams has progressed significantly, most existing methods ignore the images posted in tweets, which can carry richer information than the text and can potentially serve as a reliable indicator of whether an event has occurred. In this paper, we design an unsupervised event detection algorithm that combines textual, statistical, and image information. Specifically, the algorithm first applies semantic and statistical analyses to obtain a list of tweet clusters, each corresponding to an event candidate, and then performs image analysis to separate events from non-events. For each cluster, a convolutional autoencoder is trained as an anomaly detector, with part of the cluster's images used as training data and the remaining images used as test instances. Our experiments on multiple datasets verify that when an event occurs, the mean reconstruction errors of the training and test images are much closer to each other than when the candidate is a non-event cluster. Based on this finding, the algorithm rejects a candidate if the difference between the two mean errors exceeds a threshold. Experimental results over millions of tweets demonstrate that this image-analysis-enhanced approach can significantly increase precision with minimal impact on recall.
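The rejection rule described above can be illustrated with a minimal sketch. It assumes the per-cluster autoencoder has already been trained and has produced reconstructions for the training and test images. The function names, the use of mean squared error, the absolute difference, and the threshold value are all illustrative assumptions, not details taken from the paper.

```python
import numpy as np


def per_image_mse(images, reconstructions):
    """Mean squared reconstruction error per image (first axis indexes images).

    Assumption: MSE is used here only for illustration; the abstract does not
    specify the error measure.
    """
    images = np.asarray(images, dtype=float)
    reconstructions = np.asarray(reconstructions, dtype=float)
    diff = (images - reconstructions).reshape(len(images), -1)
    return (diff ** 2).mean(axis=1)


def error_gap(train_errors, test_errors):
    """Absolute difference between the mean training and mean test errors."""
    return abs(float(np.mean(test_errors)) - float(np.mean(train_errors)))


def is_event(train_errors, test_errors, threshold):
    """Keep the candidate when the gap does not exceed the threshold.

    A candidate whose gap is larger than the threshold is rejected as a
    non-event, following the rule stated in the abstract.
    """
    return error_gap(train_errors, test_errors) <= threshold


# Hypothetical error values for two candidate clusters.
THRESHOLD = 0.01  # assumed value, chosen only for this example

coherent_train = [0.010, 0.012, 0.011]
coherent_test = [0.013, 0.011, 0.012]
scattered_test = [0.050, 0.060, 0.040]

print(is_event(coherent_train, coherent_test, THRESHOLD))   # similar errors: kept
print(is_event(coherent_train, scattered_test, THRESHOLD))  # large gap: rejected
```

The intuition, under these assumptions, is that images from a genuine event resemble one another, so an autoencoder fitted to part of the cluster reconstructs the held-out images about as well as its own training images.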
