Identifying violent protest activity with scalable machine learning ∗

The outbreak and frequency of violent protest activity since 2010 have alarmed policy makers and the public at large and have renewed interest in the study of violent forms of protest action. Until recently, the study of violent protest action, and indeed of protest action in general, was limited to case studies (Tilly 1988; Tilly and Tarrow 2015), simulation studies (Epstein 2002), and newspaper accounts (Earl et al. 2004). With the widespread use of social media websites such as Twitter and Facebook as means of protest mobilization, along with innovations in high-dimensional statistics and machine learning, researchers are now able to collect large and geographically diverse data for studying protest activity (Barberá et al. 2015; Metzger et al. 2016; Gerbaudo 2012; Tucker et al. 2014). In this paper, we build a series of scalable machine learning algorithms and software that jointly leverage spatial and textual data to identify violent and peaceful protest activity using English-language Tweets. We then use our classifier to demonstrate how researchers can use our software to construct databases that measure violent and peaceful forms of protest activity at fine-grained levels of time and geography, and to explore relationships between Census demographics and protest activity during the Ferguson protests in November 2015. Finally, we explore how linguistic and spatial features distinguish peaceful from violent forms of collective action.

∗ All errors are our own.

† ljanastas@uga.edu, http://scholar.harvard.edu/janastas

‡ jakerylandwilliams@gmail.com, http://people.ischool.berkeley.edu/%7Ejakeryland/

[1] Regina Branton, et al. Social Protest and Policy Attitudes: The Case of the 2006 Immigrant Rallies, 2015.

[2] James P. Bagrow, et al. Zipf's law is a consequence of coherent language production, 2016, 1601.07969.

[3] Joshua A. Tucker, et al. The Critical Periphery in the Growth of Social Protests, 2015, PLoS ONE.

[4] Charu C. Aggarwal, et al. Event Detection in Social Streams, 2012, SDM.

[5] Social and Political Dimensions of Campus Protest Activity, 1972, The Journal of Politics.

[6] P. Gerbaudo. Tweets and the Streets: Social Media and Contemporary Activism, 2012.

[7] Jake Ryland Williams, et al. Boundary-based MWE segmentation with text partitioning, 2016, NUT@EMNLP.

[8] C. Tilly. Collective Violence in European Perspective, 1978.

[9] G. Carter. The 1960s Black Riots Revisited: City Level Explanations of Their Severity, 1986.

[10] Edi Winarko, et al. Event detection in social media: A survey, 2013, International Conference on ICT for Smart Society.

[11] Lei Chen, et al. Event detection over twitter social media streams, 2013, The VLDB Journal.

[12] James F. Wilson. The strategy of protest: problems of negro civic action, 1961.

[13] Emiliano Huet-Vaughn. Quiet Riot: The Causal Effect of Protest Violence, 2013.

[14] Sidney Tarrow, et al. Unwanted Children - Political Violence and the Cycle of Protest in Italy, 1966-1973, 1986.

[15] Richard Bonneau, et al. Protest in the age of social media, 2015.

[16] C. Tilly. From mobilization to revolution, 1978.

[17] Matthew Hurst, et al. Event Detection and Tracking in Social Streams, 2009, ICWSM.

[18] Paul May. Ideological justifications for restrictive immigration policies: An analysis of parliamentary discourses on immigration in France and Canada (2006–2013), 2016.

[19] Joshua A. Tucker, et al. People Power or a One-Shot Deal? A Dynamic Model of Protest, 2013.

[20] James P. Bagrow, et al. Human language reveals a universal positivity bias, 2014, Proceedings of the National Academy of Sciences.

[21] Nils B. Weidmann, et al. Violence and Ethnic Segregation: A Computational Model Applied to Baghdad, 2013.

[22] P. Torrens, et al. Modeling Geographic Behavior in Riotous Crowds, 2013.

[23] Joshua A. Tucker, et al. Tweeting identity? Ukrainian, Russian, and #Euromaidan, 2016.

[24] Robert A. Margo, et al. The Economic Aftermath of the 1960s Riots in American Cities: Evidence from Property Values, 2007, The Journal of Economic History.

[25] Christopher M. Danforth, et al. Temporal Patterns of Happiness and Information in a Global Social Network: Hedonometrics and Twitter, 2011, PLoS ONE.

[26] M. Lipsky, et al. Protest as a Political Resource, 1968, American Political Science Review.

[27] Joshua M. Epstein, et al. Modeling civil violence: An agent-based computational approach, 2002, Proceedings of the National Academy of Sciences of the United States of America.

[28] J. D. McCarthy, et al. The use of newspaper data in the study of collective action, 2003.

[29] W. Moore. Repression and dissent: Substitution, context, and timing, 1998.

[30] L. Anastasopoulos. An Experiment on the Policy Effects of Immigrant Skin Tone, 2015.

[31] Regina Branton, et al. Agenda Setting, Public Opinion, and the Issue of Immigration Reform, 2007.

[32] S. Tarrow, et al. Power in Movement: Social Movements, Collective Action and Politics, 1994.

[33] Dirk Helbing, et al. Group Segregation and Urban Violence, 2013, SSRN Electronic Journal.

[34] Michael Jones-Correa, et al. Spatial and Temporal Proximity: Examining the Effects of Protests on Political Attitudes, 2014.

[35] M. Durfee, et al. Contentious Politics, 2017.

[36] R. Sørensen. After the immigration shock: The causal effect of immigration on electoral preferences, 2016.

[37] Pascal Frossard, et al. Multiscale event detection in social media, 2014, Data Mining and Knowledge Discovery.