Text as Data for Conflict Research: A Literature Survey

Computer-aided text analysis (CATA) offers exciting new possibilities for conflict research. This contribution describes them through exemplary studies from a range of disciplines, including sociology, political science, communication studies, and computer science. The chapter synthesizes empirical research that investigates conflict in relation to text across different formats and genres. This includes conflict as it is verbalized in the news media, political speeches, and other public documents, as well as conflict that occurs in online spaces (social media platforms, forums) and is largely confined to such spaces (e.g., flaming and trolling). Particular emphasis is placed on research that aims to find commonalities between online and offline conflict and that systematically investigates the dynamics of group behavior. The chapter assesses work using inductive computational procedures, such as topic modeling, and supervised machine learning approaches, as well as more traditional forms of content analysis, such as dictionaries. Finally, cross-validation is highlighted as a crucial step in CATA, in order to make the method as useful as possible to scholars interested in enlisting text mining for conflict research.
