Detection of Hate Speech using BERT and Hate Speech Word Embedding with Deep Model

The enormous amount of data generated on the web and social media has increased the demand for detecting online hate speech. Detecting hate speech reduces its negative impact and influence on others. Much effort in the Natural Language Processing (NLP) domain has aimed to detect hate speech in general or hate speech targeting specific attributes such as religion, race, gender, or sexual orientation. Hate communities tend to use abbreviations, intentional spelling mistakes, and coded words in their communication to evade detection, which makes hate speech detection more challenging. Word representation will therefore play an increasingly pivotal role in detecting hate speech. This paper investigates the feasibility of leveraging domain-specific word embeddings in a Bidirectional LSTM-based deep model to automatically detect and classify hate speech. Furthermore, we investigate the use of a transfer-learning language model (BERT) on the hate speech problem as a binary classification task. The experiments showed that domain-specific word embeddings with the Bidirectional LSTM-based deep model achieved a 93% F1-score, while BERT achieved up to 96% F1-score on a combined balanced dataset built from available hate speech datasets.
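
For readers unfamiliar with the metric reported above, the following is a minimal sketch of how an F1-score can be computed for a binary hate / not-hate task. The function name `f1_score_binary` and the toy labels are assumptions made for illustration. They are not taken from the paper, which does not state here whether its figures are positive-class, macro, or weighted F1.

```python
def f1_score_binary(y_true, y_pred, positive=1):
    """Return the F1-score of the positive class.

    F1 is the harmonic mean of precision and recall. This helper is a
    hypothetical illustration, not the paper's evaluation code.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision or recall is zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy labels, made up for illustration: 1 = hate, 0 = not hate.
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 1]
score = f1_score_binary(y_true, y_pred)
print(f"F1 = {score:.3f}")
```

On a balanced dataset such as the combined one described above, F1 and accuracy tend to be close, but F1 remains the more informative of the two when a model favors one class.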
