TrioVecEvent: Embedding-Based Online Local Event Detection in Geo-Tagged Tweet Streams

Detecting local events (e.g., protests, disasters) at their onset is an important task for a wide spectrum of applications, ranging from disaster control and crime monitoring to place recommendation. Recent years have witnessed growing interest in leveraging geo-tagged tweet streams for online local event detection. Nevertheless, the accuracy of existing methods remains unsatisfactory for building reliable local event detection systems. We propose TrioVecEvent, a method that leverages multimodal embeddings to achieve accurate online local event detection. The effectiveness of TrioVecEvent is underpinned by its two-step detection scheme. First, it ensures high coverage of the underlying local events by dividing the tweets in the query window into coherent geo-topic clusters. To generate quality geo-topic clusters, we capture short-text semantics by learning multimodal embeddings of location, time, and text, and then perform online clustering with a novel Bayesian mixture model. Second, TrioVecEvent treats the geo-topic clusters as candidate events and extracts a set of features for classifying the candidates. Leveraging the multimodal embeddings as background knowledge, we introduce discriminative features that characterize local events well, which enables pinpointing true local events in the candidate pool with a small amount of training data. We used crowdsourcing to evaluate TrioVecEvent and found that it improves on the performance of the state-of-the-art method by a large margin.
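
The two-step scheme described above can be illustrated with a toy sketch. This is not the TrioVecEvent implementation. It assumes each tweet already carries a pre-computed embedding, replaces the Bayesian mixture model with a greedy clustering pass based on cosine similarity and a crude latitude/longitude box, and replaces the trained classifier with a hand-set threshold rule. Every name, field, and threshold below (`cluster_tweets`, `candidate_features`, `is_local_event`, `sim_threshold`, `max_deg`, and so on) is hypothetical and chosen only for illustration.

```python
import math


def normalize(v):
    """Scale a vector to unit length (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return list(v) if norm == 0 else [x / norm for x in v]


def cluster_tweets(tweets, sim_threshold=0.8, max_deg=0.05):
    """Step 1 (stand-in): greedily group tweets into candidate geo-topic clusters.

    A tweet joins the most similar existing cluster when it is both
    semantically close (cosine similarity to the cluster's mean direction)
    and geographically close (within max_deg degrees of the cluster's mean
    position). Otherwise it starts a new cluster.
    """
    clusters = []
    for t in tweets:
        unit = normalize(t["vec"])
        best, best_sim = None, sim_threshold
        for c in clusters:
            n = len(c["members"])
            mean_dir = normalize(c["vec_sum"])
            sim = sum(a * b for a, b in zip(unit, mean_dir))
            near = (abs(t["lat"] - c["lat_sum"] / n) <= max_deg
                    and abs(t["lon"] - c["lon_sum"] / n) <= max_deg)
            if near and sim >= best_sim:
                best, best_sim = c, sim
        if best is None:
            best = {"members": [], "vec_sum": [0.0] * len(unit),
                    "lat_sum": 0.0, "lon_sum": 0.0}
            clusters.append(best)
        best["members"].append(t)
        best["vec_sum"] = [a + b for a, b in zip(best["vec_sum"], unit)]
        best["lat_sum"] += t["lat"]
        best["lon_sum"] += t["lon"]
    return clusters


def candidate_features(cluster):
    """Step 2a (stand-in): describe a candidate with two simple features."""
    n = len(cluster["members"])
    # Length of the mean unit vector: near 1 when members point the same way.
    concentration = math.sqrt(sum(x * x for x in cluster["vec_sum"])) / n
    return {"size": n, "concentration": concentration}


def is_local_event(features, min_size=3, min_concentration=0.9):
    """Step 2b (stand-in): a fixed rule in place of a trained classifier."""
    return (features["size"] >= min_size
            and features["concentration"] >= min_concentration)


# Made-up tweets with 2-dimensional embeddings, for illustration only.
tweets = [
    {"lat": 40.750, "lon": -73.990, "vec": [1.0, 0.1]},
    {"lat": 40.751, "lon": -73.991, "vec": [0.9, 0.2]},
    {"lat": 40.749, "lon": -73.989, "vec": [1.0, 0.0]},
    {"lat": 34.050, "lon": -118.240, "vec": [0.0, 1.0]},
]
clusters = cluster_tweets(tweets)
decisions = [is_local_event(candidate_features(c)) for c in clusters]
```

In this toy input the three nearby, semantically similar tweets form one candidate that passes the rule, while the isolated tweet forms a singleton candidate that is rejected. A real system would learn the decision boundary from labeled candidates rather than fix thresholds by hand.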
