Towards generalisable hate speech detection: a review on obstacles and solutions
[1] Marco Guerini,et al. CONAN - COunter NArratives through Nichesourcing: a Multilingual Dataset of Responses to Fight Online Hate Speech , 2019, ACL.
[2] Helen Yannakoudakis,et al. Neural Character-based Composition Models for Abuse Detection , 2018, ALW.
[3] Omer Levy,et al. Dependency-Based Word Embeddings , 2014, ACL.
[4] Kevin Gimpel,et al. From Paraphrase Database to Compositional Paraphrase Model and Back , 2015, Transactions of the Association for Computational Linguistics.
[5] Endang Wahyu Pamungkas,et al. Cross-domain and Cross-lingual Abusive Language Detection: A Hybrid Approach with Deep Learning and a Multilingual Lexicon , 2019, ACL.
[6] Paula Fortuna,et al. How well do hate speech, toxicity, abusive and offensive language classification models generalize across datasets? , 2021, Inf. Process. Manag..
[7] Yaser Al-Onaizan,et al. Neural Word Decomposition Models for Abusive Language Detection , 2019, ArXiv.
[8] Rich Caruana,et al. Multitask Learning , 1997, Machine Learning.
[9] Jeffrey Pennington,et al. GloVe: Global Vectors for Word Representation , 2014, EMNLP.
[10] Paolo Rosso,et al. AMI @ EVALITA2020: Automatic Misogyny Identification , 2020, EVALITA.
[11] Björn Gambäck,et al. Studying Generalisability across Abusive Language Detection Datasets , 2019, CoNLL.
[12] Xiang Ren,et al. Contextualizing Hate Speech Classifiers with Post-hoc Explanation , 2020, ACL.
[13] Hind Saleh Alatawi,et al. Detecting White Supremacist Hate Speech Using Domain Specific Word Embedding With Deep Learning and BERT , 2020, IEEE Access.
[14] David Robinson,et al. Detecting Hate Speech on Twitter Using a Convolution-GRU Based Deep Neural Network , 2018, ESWC.
[15] Valerio Basile,et al. It's the End of the Gold Standard as we Know it. On the Impact of Pre-aggregation on the Evaluation of Highly Subjective Tasks , 2020, DP@AI*IA.
[16] Joachim Bingel,et al. Bridging the Gaps: Multi Task Learning for Domain Transfer of Hate Speech Detection , 2018.
[17] Rui Zhao,et al. Automatic detection of cyberbullying on social networks based on bullying features , 2016, ICDCN.
[18] Yue Ning,et al. Empirical Analysis of Multi-Task Learning for Reducing Identity Bias in Toxic Comment Detection , 2020, ICWSM.
[19] Viviana Patti,et al. Resources and benchmark corpora for hate speech detection: a systematic review , 2020, Language Resources and Evaluation.
[20] Paolo Rosso,et al. Overview of the Evalita 2018 Task on Automatic Misogyny Identification (AMI) , 2018, EVALITA@CLiC-it.
[21] Omer Levy,et al. RoBERTa: A Robustly Optimized BERT Pretraining Approach , 2019, ArXiv.
[22] Amit Awekar,et al. Deep Learning for Detecting Cyberbullying Across Multiple Social Media Platforms , 2018, ECIR.
[23] Taha Yasseri,et al. Detecting weak and strong Islamophobic hate speech on social media , 2018, Journal of Information Technology & Politics.
[24] Nitish Srivastava,et al. Dropout: a simple way to prevent neural networks from overfitting , 2014, J. Mach. Learn. Res..
[25] Goran Glavaš,et al. XHate-999: Analyzing and Detecting Abusive Language Across Domains and Languages , 2020, COLING.
[26] Scott A. Hale,et al. Challenges and frontiers in abusive content detection , 2019, Proceedings of the Third Workshop on Abusive Language Online.
[27] Viviana Patti,et al. HurtBERT: Incorporating Lexical Features with BERT for the Detection of Abusive Language , 2020, ALW.
[28] Francesca Gasparini,et al. Detecting Sexist MEME On The Web: A Study on Textual and Visual Cues , 2019, 2019 8th International Conference on Affective Computing and Intelligent Interaction Workshops and Demos (ACIIW).
[29] Hal Daumé,et al. Frustratingly Easy Domain Adaptation , 2007, ACL.
[30] Benjamin Heinzerling,et al. BPEmb: Tokenization-free Pre-trained Subword Embeddings in 275 Languages , 2017, LREC.
[31] Pascale Fung,et al. Reducing Gender Bias in Abusive Language Detection , 2018, EMNLP.
[32] Lei Gao,et al. Recognizing Explicit and Implicit Hate Speech Using a Weakly Supervised Two-path Bootstrapping Approach , 2017, IJCNLP.
[33] Xiang Ren,et al. Towards Hierarchical Importance Attribution: Explaining Compositional Semantics for Neural Sequence Models , 2020, ICLR.
[34] Ritesh Kumar,et al. Aggression-annotated Corpus of Hindi-English Code-mixed Data , 2018, LREC.
[35] Pete Burnap,et al. The Enemy Among Us , 2013, ACM Trans. Web.
[36] Vasudeva Varma,et al. Deep Learning for Hate Speech Detection in Tweets , 2017, WWW.
[37] Brendan T. O'Connor,et al. Racial Disparity in Natural Language Processing: A Case Study of Social Media African-American English , 2017, ArXiv.
[38] Viviana Patti,et al. Misogyny Detection in Twitter: a Multilingual and Cross-Domain Study , 2020, Inf. Process. Manag..
[39] Sandra Kübler,et al. Investigating Sampling Bias in Abusive Language Detection , 2020, ALW.
[40] Dit-Yan Yeung,et al. Comparative Evaluation of Label Agnostic Selection Bias in Multilingual Hate Speech Datasets , 2020, EMNLP.
[41] Leon Derczynski,et al. Directions in Abusive Language Training Data: Garbage In, Garbage Out , 2020, ArXiv.
[42] Jacob Cohen. A Coefficient of Agreement for Nominal Scales , 1960.
[43] Michael Wiegand,et al. A Survey on Hate Speech Detection using Natural Language Processing , 2017, SocialNLP@EACL.
[44] Yue Ning,et al. Empirical Analysis of Multi-Task Learning for Reducing Model Bias in Toxic Comment Detection , 2019, ArXiv.
[45] Zeerak Waseem,et al. Are You a Racist or Am I Seeing Things? Annotator Influence on Hate Speech Detection on Twitter , 2016, NLP+CSS@EMNLP.
[46] Jan Snajder,et al. Cross-Domain Detection of Abusive Language Online , 2018, ALW.
[47] Preslav Nakov,et al. SemEval-2019 Task 6: Identifying and Categorizing Offensive Language in Social Media (OffensEval) , 2019, *SEMEVAL.
[48] Julia Hirschberg,et al. Detecting Hate Speech on the World Wide Web , 2012.
[49] Josef Ruppenhofer,et al. Treebanking user-generated content: a UD based overview of guidelines, corpora and unified recommendations , 2020, Language Resources and Evaluation.
[50] Noel Crespi,et al. Hate speech detection and racial bias mitigation in social media based on BERT model , 2020, PloS one.
[51] Tommaso Caselli,et al. I Feel Offended, Don’t Be Abusive! Implicit/Explicit Messages in Offensive and Abusive Language , 2020, LREC.
[52] Maite Taboada,et al. The SFU Opinion and Comments Corpus: A Corpus for the Analysis of Online News Comments , 2019, Corpus pragmatics : international journal of corpus linguistics and pragmatics.
[53] Tomas Mikolov,et al. Bag of Tricks for Efficient Text Classification , 2016, EACL.
[54] Mauro Conti,et al. All You Need is "Love": Evading Hate Speech Detection , 2018, ArXiv.
[55] Yulia Tsvetkov,et al. Demoting Racial Bias in Hate Speech Detection , 2020, SOCIALNLP.
[56] Rui Cao,et al. DeepHate: Hate Speech Detection via Multi-Faceted Text Representations , 2020, WebSci.
[57] Hao Chen,et al. A Comparison of Classical Versus Deep Learning Techniques for Abusive Content Detection on Social Media Sites , 2018, SocInfo.
[58] Vasudeva Varma,et al. FERMI at SemEval-2019 Task 5: Using Sentence embeddings to Identify Hate Speech Against Immigrants and Women in Twitter , 2019, *SEMEVAL.
[59] Ingmar Weber,et al. Understanding Abuse: A Typology of Abusive Language Detection Subtasks , 2017, ALW@ACL.
[60] Guillaume Lample,et al. Word Translation Without Parallel Data , 2017, ICLR.
[61] Hao Chen,et al. The Use of Deep Learning Distributed Representations in the Identification of Abusive Text , 2019, ICWSM.
[62] Scott A. Hale,et al. Detecting East Asian Prejudice on Social Media , 2020, ALW.
[63] Tomas Mikolov,et al. Enriching Word Vectors with Subword Information , 2016, TACL.
[64] Liviu P. Dinu,et al. On Transfer Learning for Detecting Abusive Language Online , 2019, IWANN.
[65] Jorge Pérez,et al. Hate speech detection is not as easy as you may think: A closer look at model validation (extended version) , 2020, Inf. Syst..
[66] Jiebo Luo,et al. Determining Code Words in Euphemistic Hate Speech Using Word Embedding Networks , 2018, ALW.
[67] Ming-Wei Chang,et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding , 2019, NAACL.
[68] Tomas Mikolov,et al. Advances in Pre-Training Distributed Word Representations , 2017, LREC.
[69] Dirk Hovy,et al. Hateful Symbols or Hateful People? Predictive Features for Hate Speech Detection on Twitter , 2016, NAACL.
[70] Michele Banko,et al. A Unified Taxonomy of Harmful Content , 2020, ALW.
[71] Noel Crespi,et al. A BERT-Based Transfer Learning Approach for Hate Speech Detection in Online Social Media , 2019, COMPLEX NETWORKS.
[72] Michael I. Jordan,et al. Latent Dirichlet Allocation , 2001, J. Mach. Learn. Res..
[73] Gianluca Stringhini,et al. Large Scale Crowdsourcing and Characterization of Twitter Abusive Behavior , 2018, ICWSM.
[74] Natasha Duarte,et al. Mixed Messages? The Limits of Automated Social Media Content Analysis , 2018, FAT.
[75] Ralf Krestel,et al. Challenges for Toxic Comment Classification: An In-Depth Error Analysis , 2018, ALW.
[76] Joel R. Tetreault,et al. Abusive Language Detection in Online User Content , 2016, WWW.
[77] Ritesh Kumar,et al. Benchmarking Aggression Identification in Social Media , 2018, TRAC@COLING 2018.
[78] Adam Tauman Kalai,et al. Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings , 2016, NIPS.
[79] Lei Gao,et al. Detecting Online Hate Speech Using Context Aware Models , 2017, RANLP.
[80] Tommaso Caselli,et al. HateBERT: Retraining BERT for Abusive Language Detection in English , 2020, WOAH.
[81] Ziqi Zhang,et al. Hate Speech Detection: A Solved Problem? The Challenging Case of Long Tail on Twitter , 2018, Semantic Web.
[82] Carlos Ortiz,et al. Intersectional Bias in Hate Speech and Abusive Language Datasets , 2020, ArXiv.
[83] Yejin Choi,et al. The Risk of Racial Bias in Hate Speech Detection , 2019, ACL.
[84] Matthew Leighton Williams,et al. The Enemy Among Us: Detecting Hate Speech with Threats Based 'Othering' Language Embeddings , 2018.
[85] Nikos Pelekis,et al. DataStories at SemEval-2017 Task 4: Deep LSTM with Attention for Message-level and Topic-based Sentiment Analysis , 2017, *SEMEVAL.
[86] Dirk Hovy,et al. The Social Impact of Natural Language Processing , 2016, ACL.
[87] Preslav Nakov,et al. SemEval-2020 Task 12: Multilingual Offensive Language Identification in Social Media (OffensEval 2020) , 2020, SEMEVAL.
[88] Aida Mostafazadeh Davani,et al. The Gab Hate Corpus: A collection of 27k posts annotated for hate speech , 2018.
[89] Paula Fortuna,et al. Toxic, Hateful, Offensive or Abusive? What Are We Really Classifying? An Empirical Analysis of Hate Speech Datasets , 2020, LREC.
[90] Elisabetta Fersini,et al. Unintended Bias in Misogyny Detection , 2019, 2019 IEEE/WIC/ACM International Conference on Web Intelligence (WI).
[91] Franck Dernoncourt,et al. Multilingual Twitter Corpus and Baselines for Evaluating Demographic Bias in Hate Speech Recognition , 2020, LREC.
[92] Lucas Dixon,et al. Ex Machina: Personal Attacks Seen at Scale , 2016, WWW.
[93] Cody Buntain,et al. A Large Labeled Corpus for Online Harassment Research , 2017, WebSci.
[94] Nan Hua,et al. Universal Sentence Encoder for English , 2018, EMNLP.
[95] Guigang Zhang,et al. Deep Learning , 2016, Int. J. Semantic Comput..
[96] Brendan T. O'Connor,et al. Demographic Dialectal Variation in Social Media: A Case Study of African-American English , 2016, EMNLP.
[97] Ji Ho Park,et al. Finding Good Representations of Emotions for Text Classification , 2018, ArXiv.
[98] Saif Mohammad,et al. Examining Gender and Race Bias in Two Hundred Sentiment Analysis Systems , 2018, *SEMEVAL.
[99] Yi-Shin Chen,et al. Surfacing contextual hate speech words within social media , 2017, ArXiv.
[100] Emily Ahn,et al. Finding Microaggressions in the Wild: A Case for Locating Elusive Phenomena in Social Media Posts , 2019, EMNLP.
[101] Michael Wiegand,et al. Detection of Abusive Language: the Problem of Biased Datasets , 2019, NAACL.
[102] Peter Norvig,et al. The Unreasonable Effectiveness of Data , 2009, IEEE Intelligent Systems.
[103] Preslav Nakov,et al. Predicting the Type and Target of Offensive Posts in Social Media , 2019, NAACL.
[104] A. Al-Hassan,et al. DETECTION OF HATE SPEECH IN SOCIAL NETWORKS: A SURVEY ON MULTILINGUAL CORPUS , 2019, Computer Science & Information Technology(CS & IT).
[105] Shubhanshu Mishra,et al. 3Idiots at HASOC 2019: Fine-tuning Transformer Neural Networks for Hate Speech Identification in Indo-European Languages , 2019, FIRE.
[106] Helen Yannakoudakis,et al. Tackling Online Abuse: A Survey of Automated Abuse Detection Methods , 2019, ArXiv.
[107] Gianluca Stringhini,et al. Class-based Prediction Errors to Detect Hate Speech with Out-of-vocabulary Words , 2017, ALW@ACL.
[108] Paolo Rosso,et al. Overview of the Task on Automatic Misogyny Identification at IberEval 2018 , 2018, IberEval@SEPLN.
[109] Ingmar Weber,et al. Racial Bias in Hate Speech and Abusive Language Detection Datasets , 2019, Proceedings of the Third Workshop on Abusive Language Online.
[110] Vasudeva Varma,et al. Stereotypical Bias Removal for Hate Speech Detection Task using Knowledge-based Generalizations , 2019, WWW.
[111] Danah Boyd,et al. Fairness and Abstraction in Sociotechnical Systems , 2019, FAT.
[112] Timnit Gebru,et al. Gender Shades: Intersectional Accuracy Disparities in Commercial Gender Classification , 2018, FAT.
[113] Ingmar Weber,et al. Automated Hate Speech Detection and the Problem of Offensive Language , 2017, ICWSM.
[114] Liang Zou,et al. NULI at SemEval-2019 Task 6: Transfer Learning for Offensive Language Detection using Bidirectional Transformers , 2019, *SEMEVAL.
[115] Yejin Choi,et al. Social Bias Frames: Reasoning about Social and Power Implications of Language , 2020, ACL.
[116] Mai ElSherief,et al. Leveraging Intra-User and Inter-User Representation Learning for Automated Hate Speech Detection , 2018, NAACL.
[117] Paolo Rosso,et al. SemEval-2019 Task 5: Multilingual Detection of Hate Speech Against Immigrants and Women in Twitter , 2019, *SEMEVAL.
[118] Björn Gambäck,et al. A Platform Agnostic Dual-Strand Hate Speech Detector , 2019.
[119] Stan Matwin,et al. Offensive Language Detection Using Multi-level Classification , 2010, Canadian Conference on AI.
[120] Jacob Eisenstein,et al. Mimicking Word Embeddings using Subword RNNs , 2017, EMNLP.
[121] Prasenjit Majumder,et al. Overview of the HASOC track at FIRE 2019: Hate Speech and Offensive Content Identification in Indo-European Languages , 2019, FIRE.
[122] Lucy Vasserman,et al. Measuring and Mitigating Unintended Bias in Text Classification , 2018, AIES.
[123] Jeffrey Dean,et al. Distributed Representations of Words and Phrases and their Compositionality , 2013, NIPS.
[124] Sérgio Nunes,et al. A Survey on Automatic Detection of Hate Speech in Text , 2018, ACM Comput. Surv..
[125] Veselin Stoyanov,et al. Unsupervised Cross-lingual Representation Learning at Scale , 2019, ACL.
[126] Kyomin Jung,et al. Comparative Studies of Detecting Abusive Language on Twitter , 2018, ALW.
[127] Manish Shrivastava,et al. Degree based Classification of Harmful Speech using Twitter Data , 2018, TRAC@COLING 2018.
[128] Ona de Gibert,et al. Hate Speech Dataset from a White Supremacy Forum , 2018, ALW.
[129] Eric Gilbert,et al. VADER: A Parsimonious Rule-Based Model for Sentiment Analysis of Social Media Text , 2014, ICWSM.
[130] Rachael Tatman,et al. Gender and Dialect Bias in YouTube’s Automatic Captions , 2017, EthNLP@EACL.