Studying Generalisability across Abusive Language Detection Datasets

Work on abusive language detection has tackled a wide range of subtasks and domains. As a result, there is a great deal of redundancy and poor generalisability across datasets. Through cross-dataset training and testing experiments, the paper shows that the widely assumed practice of including more non-abusive samples in a dataset (to emulate reality) may reduce the generalisability of a model trained on that data. A hierarchical annotation model is therefore used to reveal redundancies in existing datasets and to help reduce redundancy in future efforts.

Björn Gambäck | Anupam Jamatia | Steve Durairaj Swamy