CONAN - COunter NArratives through Nichesourcing: a Multilingual Dataset of Responses to Fight Online Hate Speech

Although there is an unprecedented effort to provide adequate responses, in terms of laws and policies, to hate content on social media platforms, dealing with hatred online is still a tough problem. Tackling hate speech in the standard way, through content deletion or user suspension, may draw charges of censorship and overblocking. One alternative strategy, which has so far received little attention from the research community, is to actually oppose hate content with counter-narratives (i.e., informed textual responses). In this paper, we describe the creation of the first large-scale, multilingual, expert-based dataset of hate speech/counter-narrative pairs. This dataset has been built with the effort of more than 100 operators from three different NGOs who applied their training and expertise to the task. Together with the collected data, we also provide additional annotations about expert demographics, hate and response type, and data augmentation through translation and paraphrasing. Finally, we provide initial experiments to assess the quality of our data.
