Unsupervised Spatial Event Detection in Targeted Domains with Applications to Civil Unrest Modeling

Twitter has become a popular surrogate data source for monitoring and detecting events. Targeted domains such as crime, elections, and social unrest require algorithms capable of detecting events pertinent to those domains. Because Twitter data streams are typically unstructured, short, dynamic, and heterogeneous, developing and maintaining supervised learning systems is technically difficult and labor-intensive. We present a novel unsupervised approach for detecting spatial events in targeted domains and illustrate it on one specific domain, civil unrest modeling. Given a targeted domain, we propose a dynamic query expansion algorithm that iteratively expands domain-related terms and generates a tweet homogeneous graph. An anomaly identification method then detects spatial events over this graph by jointly maximizing local modularity and spatial scan statistics. Extensive experiments conducted in 10 Latin American countries demonstrate the effectiveness of the proposed approach.
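
The iterative term expansion mentioned above can be illustrated with a minimal sketch. The abstract does not give the actual weighting scheme, so this example assumes a simple stand-in: a term is scored by the number of currently matched tweets that contain it. The function name `expand_query` and the parameters `top_k` and `max_iter` are hypothetical and not taken from the paper.

```python
from collections import Counter


def expand_query(tweets, seed_terms, top_k=3, max_iter=10):
    """Grow a term set from seed_terms until it stops changing.

    Assumption: terms are ranked by raw document frequency among the
    tweets that match the current term set. This is an illustrative
    stand-in, not the paper's weighting.
    """
    # Represent each tweet as a set of lowercase whitespace tokens.
    docs = [set(t.lower().split()) for t in tweets]
    seeds = set(seed_terms)
    terms = set(seeds)
    for _ in range(max_iter):
        # Tweets sharing at least one term are treated as domain-related.
        matched = [d for d in docs if d & terms]
        # Count how many matched tweets contain each word.
        counts = Counter(w for d in matched for w in d)
        # Rank by descending count, breaking ties alphabetically.
        ranked = sorted(counts, key=lambda w: (-counts[w], w))
        # Always keep the seeds; add the top_k ranked words.
        new_terms = seeds | set(ranked[:top_k])
        if new_terms == terms:
            break  # converged: the term set is stable
        terms = new_terms
    return terms


tweets = ["protest strike", "protest strike union",
          "strike union rally", "beach day"]
print(sorted(expand_query(tweets, {"protest"})))
```

Starting from the single seed "protest", the term set picks up words that co-occur with it and stops once an iteration leaves the set unchanged. A realistic version would also need stop-word filtering and a weighting that discounts globally frequent words.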
