Paper information - FraunhoferSIT at GermEval 2019: Can Machines Distinguish Between Offensive Language and Hate Speech? Towards a Fine-Grained Classification


In this paper, we describe the FraunhoferSIT submission for the “GermEval 2019 – Shared Task on the Identification of Offensive Language”. We participated in two subtasks: Task 1 is a binary classification of German tweets to identify offensive language, and Task 2 is a fine-grained classification that distinguishes between three subcategories of offensive language. Our best model is an SVM classifier based on TF-IDF character n-gram features. Our submitted runs in the shared task are: FraunhoferSIT coarse [1-3].txt for Task 1 and FraunhoferSIT fine [1-3].txt for Task 2. Our final system reaches a macro-average F1-score of 0.70 for the binary classification and an F1-score of 0.46 for the fine-grained classification. These results show that the problem of automatically distinguishing between offensive language and “Hate Speech” is far from solved.
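The macro-average F1-score reported in the abstract weights every class equally, regardless of how many tweets it contains. The following is a minimal sketch of the common definition (the unweighted mean of per-class F1), not the shared task's official scorer, which may differ in detail. The function name `macro_f1` and the toy labels `OTHER` and `OFFENSE` are illustrative assumptions and are not taken from the paper.

```python
from collections import Counter


def macro_f1(gold, predicted):
    """Unweighted mean of per-class F1 over all labels seen in gold or predicted."""
    labels = sorted(set(gold) | set(predicted))
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1          # correct prediction for class g
        else:
            fp[p] += 1          # p was predicted but is wrong
            fn[g] += 1          # g was the true class but was missed
    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when the denominator is 0
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy data (hypothetical labels): a classifier that always predicts the majority class.
gold = ["OTHER", "OTHER", "OTHER", "OFFENSE"]
pred = ["OTHER", "OTHER", "OTHER", "OTHER"]
print(round(macro_f1(gold, pred), 3))
```

In this toy case, accuracy is 0.75, but the macro-average F1 is much lower because the minority class contributes an F1 of zero. This is one reason a macro-averaged score is a demanding measure for imbalanced, fine-grained label sets.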

Inna Vogel | Roey Regev
