Paper information - Detection of Hate Speech using BERT and Hate Speech Word Embedding with Deep Model

The enormous amount of data being generated on the web and social media has increased the demand for detecting online hate speech. Detecting hate speech will reduce its negative impact and influence on others. Much effort in the Natural Language Processing (NLP) domain has aimed to detect hate speech in general or to detect specific kinds of hate speech, such as that targeting religion, race, gender, or sexual orientation. Hate communities tend to use abbreviations, intentional spelling mistakes, and coded words in their communication to evade detection, which adds further challenges to hate speech detection tasks. Thus, word representation will play an increasingly pivotal role in detecting hate speech. This paper investigates the feasibility of leveraging domain-specific word embedding in a Bidirectional LSTM based deep model to automatically detect and classify hate speech. Furthermore, we investigate the use of the transfer learning language model (BERT) on the hate speech problem as a binary classification task. The experiments showed that domain-specific word embedding with the Bidirectional LSTM based deep model achieved a 93% F1-score, while BERT achieved up to a 96% F1-score on a combined balanced dataset built from available hate speech datasets.
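The results above are reported as F1-scores on a binary (hate / not hate) task. As a reminder of what that metric measures, here is a minimal sketch in plain Python. It assumes the binary F1 of the positive (hate) class; the abstract does not say which averaging the authors used, and the function name `f1_score` and the toy labels are illustrative only, not taken from the paper.

```python
def f1_score(y_true, y_pred, positive=1):
    """Binary F1 for the `positive` class (assumed here to be 'hate' = 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision or recall is zero (or undefined),
        # so F1 is conventionally reported as 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # Harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


# Toy, made-up labels on a balanced set (4 hate, 4 not hate).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
score = f1_score(y_true, y_pred)
print(f"F1 = {score:.2f}")
```

Because F1 is the harmonic mean of precision and recall, a classifier must do well on both to score highly, which is one reason the metric is commonly reported alongside accuracy even on balanced data.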