Online Hate Interpretation Varies by Country, But More by Individual: A Statistical Analysis Using Crowdsourced Ratings

Hate is prevalent in online social media. This has resulted in a considerable amount of research in detecting and scoring it. Most computational efforts involve machine learning with crowdsourced ratings as training data. A prominent example of this is the Perspective API, a tool by Google to score the toxicity of online comments. However, a major issue in the existing approaches is the lack of consideration for the subjective nature of online hate. While there is research that shows the intensity of hate varies and that hate depends on the context, there is no research that systematically investigates how hate interpretation varies by country or individual. In this exploratory research, we undertake this challenge. We sample crowd workers from 50 countries, have them score the same social media comments for toxicity, and then evaluate the differences in the scores, altogether 18,125 ratings. We find that the interpretation score differences among countries are highly significant. However, the hate interpretations vary more by the individual raters than by countries. These findings suggest that hate scoring systems should consider user-level features when scoring and automating the processing of online hate.
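
One way to picture the country-versus-individual comparison is a simple variance decomposition. The sketch below is illustrative only and is not the statistical procedure used in the study: the function name `eta_squared`, the record fields (`rater`, `country`, `score`), and the eight toy ratings are all assumptions made up for this example. It computes the share of total rating variance that lies between groups, first grouping by country and then by rater.

```python
from collections import defaultdict
from statistics import mean


def eta_squared(ratings, key):
    """Share of total score variance that lies between the groups defined by `key`."""
    scores = [r["score"] for r in ratings]
    grand_mean = mean(scores)
    ss_total = sum((s - grand_mean) ** 2 for s in scores)

    # Collect the scores belonging to each group (e.g. each country or rater).
    groups = defaultdict(list)
    for r in ratings:
        groups[r[key]].append(r["score"])

    # Between-group sum of squares: group size times squared mean offset.
    ss_between = sum(
        len(g) * (mean(g) - grand_mean) ** 2 for g in groups.values()
    )
    return ss_between / ss_total if ss_total else 0.0


# Hypothetical toy data: four raters in two countries, two ratings each.
ratings = [
    {"rater": "r1", "country": "A", "score": 1},
    {"rater": "r1", "country": "A", "score": 2},
    {"rater": "r2", "country": "A", "score": 4},
    {"rater": "r2", "country": "A", "score": 5},
    {"rater": "r3", "country": "B", "score": 2},
    {"rater": "r3", "country": "B", "score": 3},
    {"rater": "r4", "country": "B", "score": 5},
    {"rater": "r4", "country": "B", "score": 6},
]

by_country = eta_squared(ratings, "country")
by_rater = eta_squared(ratings, "rater")
print(round(by_country, 3))  # 0.091
print(round(by_rater, 3))    # 0.909
```

Because each rater belongs to exactly one country, the rater-level share can never be smaller than the country-level share in this decomposition. The informative quantity is therefore the gap between the two, which reflects variation among raters within the same country; a nested or mixed-effects model would be the more rigorous way to estimate it.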
