We need to go deeper: measuring electoral violence using convolutional neural networks and social media

Abstract Electoral violence is conceived of as violence that occurs contemporaneously with elections and that would not have occurred in the absence of an election. While measuring the temporal aspect of this phenomenon is straightforward, determining whether occurrences of violence are truly related to elections is more difficult. We use machine learning to measure electoral violence across three elections from disaggregated reporting in social media. We demonstrate that our methodology is more than 30 percent more accurate in measuring electoral violence than previously utilized models. Additionally, we show that our measures of electoral violence conform to theoretical expectations of this conflict more closely than those in event datasets commonly used to measure electoral violence, including ACLED, ICEWS, and SCAD. Finally, we demonstrate the validity of our data by developing a qualitative coding ontology.
