'Beating the news' with EMBERS: forecasting civil unrest using open source indicators

We describe the design, implementation, and evaluation of EMBERS, an automated system that runs continuously (24x7) to forecast civil unrest across 10 countries of Latin America using open source indicators such as tweets, news sources, blogs, and economic indicators. Unlike retrospective studies, EMBERS has been making forecasts into the future since November 2012, and these forecasts have been, and continue to be, evaluated by an independent test and evaluation (T&E) team (MITRE). Notably, EMBERS successfully forecast the June 2013 protests in Brazil and the February 2014 violent protests in Venezuela. We outline the system architecture of EMBERS, the individual models that leverage specific data sources, and a fusion and suppression engine that supports trading off specific evaluation criteria. EMBERS also provides an audit trail interface for investigating why a specific prediction was made and which data were used to produce it. Through numerous evaluations, we demonstrate that EMBERS outperforms base-rate methods and can forecast significant societal events.
