Paper information - UPB at GermEval-2019 Task 2: BERT-Based Offensive Language Classification of German Tweets

In this paper, we describe our participation in GermEval-2019 Task 2, which requires identifying and classifying offensive content in German tweets. The task comprises three challenging subtasks: i) Subtask 1, a binary classification between Offensive and NonOffensive tweets; ii) Subtask 2, a fine-grained classification into three categories: Profanity, Insult, and Abuse; and iii) Subtask 3, detecting whether a tweet contains Explicit or Implicit offensive language. For all three subtasks, we used the Bidirectional Encoder Representations from Transformers (BERT) model, pre-trained on German Wikipedia and German Twitter corpora and then fine-tuned on the competition dataset. Thus, our approach focuses on how to pre-train, fine-tune, and deploy a BERT model to classify German tweets. On the test data, our best submission achieves an average F1-score of 76.95% on Subtask 1, 53.59% on Subtask 2, and 70.84% on Subtask 3.
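
The scores in the abstract are reported as an average F1-score per subtask. As a minimal sketch of how such a figure could be computed, the Python below assumes that "average" means the unweighted (macro) mean of the per-class F1 scores, which the abstract does not state explicitly. The label spellings, the `SUBTASK_LABELS` dictionary, the function names, and the toy data are all hypothetical and are not taken from the authors' code.

```python
from typing import Dict, List, Sequence

# Label sets as named in the abstract; the exact string spellings are assumptions.
SUBTASK_LABELS: Dict[str, List[str]] = {
    "subtask1": ["Offensive", "NonOffensive"],
    "subtask2": ["Profanity", "Insult", "Abuse"],
    "subtask3": ["Explicit", "Implicit"],
}


def f1_for_label(gold: Sequence[str], pred: Sequence[str], label: str) -> float:
    """F1 for one class, treating `label` as the positive class."""
    tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
    fp = sum(1 for g, p in zip(gold, pred) if g != label and p == label)
    fn = sum(1 for g, p in zip(gold, pred) if g == label and p != label)
    if tp == 0:
        # No true positives: precision or recall is zero (or undefined), so F1 is 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def average_f1(gold: Sequence[str], pred: Sequence[str], labels: Sequence[str]) -> float:
    """Unweighted mean of per-class F1 (assumed meaning of 'average F1')."""
    return sum(f1_for_label(gold, pred, label) for label in labels) / len(labels)


# Toy example with made-up labels, not competition data.
toy_gold = ["Offensive", "NonOffensive", "Offensive", "NonOffensive"]
toy_pred = ["Offensive", "Offensive", "Offensive", "NonOffensive"]
toy_score = average_f1(toy_gold, toy_pred, SUBTASK_LABELS["subtask1"])
print(f"Subtask 1 toy average F1: {toy_score:.4f}")
```

Averaging per class rather than per tweet gives rare classes the same weight as frequent ones, which matters for the fine-grained categories of Subtask 2 if they are unevenly represented.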

Dumitru-Clementin Cercel | Andrei Paraschiv
