Challenges of Hate Speech Detection in Social Media

The detection of hate speech in social media is a crucial task. The uncontrolled spread of hate has the potential to gravely damage our society and to severely harm marginalized people or groups. A major arena for spreading hate speech online is social media. This contributes significantly to the difficulty of automatic detection, as social media posts include paralinguistic signals (e.g., emoticons and hashtags), and their linguistic content contains plenty of poorly written text. Another difficulty is the context-dependent nature of the task and the lack of consensus on what constitutes hate speech, which makes the task difficult even for humans. This in turn makes creating large labeled corpora difficult and resource-consuming. The problem posed by ungrammatical text has been largely mitigated by the recent emergence of deep neural network (DNN) architectures that have the capacity to efficiently learn various features. For this reason, we proposed a deep natural language processing (NLP) model, combining convolutional and recurrent layers, for the automatic detection of hate speech in social media data. We applied our model to the HASOC2019 corpus and attained a macro F1 score of 0.63 in hate speech detection on the HASOC test set. The capacity of DNNs for efficient learning, however, also means an increased risk of overfitting, particularly when limited training data are available (as was the case for HASOC). For this reason, we investigated different methods for expanding the resources used. We explored various options, such as leveraging unlabeled data and similarly labeled corpora, as well as the use of novel models. Our results showed that by doing so, it was possible to significantly increase the classification score attained.
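For readers unfamiliar with the metric reported above, the following is a minimal sketch of how a macro F1 score is computed: the F1 score is calculated separately for each class and the per-class scores are averaged with equal weight, so a rare class counts as much as a frequent one. The function name `macro_f1`, the labels `"hate"` and `"none"`, and the toy predictions are illustrative assumptions, not the evaluation code or data used for HASOC.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (illustrative helper)."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the denominator is 0
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy, made-up labels: two "hate" posts and four "none" posts.
y_true = ["hate", "hate", "none", "none", "none", "none"]
y_pred = ["hate", "none", "none", "none", "none", "hate"]
print(macro_f1(y_true, y_pred))  # per-class F1: hate 0.5, none 0.75 -> 0.625
```

In this toy case plain accuracy would be 4/6, while the macro average is pulled down by the weaker score on the minority class, which is why macro averaging is a common choice for imbalanced classification tasks.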
