Detecting Cross-Geographic Biases in Toxicity Modeling on Social Media
Vinodkumar Prabhakaran | Sayan Ghosh | Dylan Baker | David Jurgens
[1] Siva Reddy,et al. StereoSet: Measuring stereotypical bias in pretrained language models , 2021, ACL.
[2] Zeerak Waseem,et al. Are You a Racist or Am I Seeing Things? Annotator Influence on Hate Speech Detection on Twitter , 2016, NLP+CSS@EMNLP.
[3] Bernard J. Jansen,et al. Online Hate Ratings Vary by Extremes: A Statistical Analysis , 2019, CHIIR.
[4] Yejin Choi,et al. Challenges in Automated Debiasing for Toxic Language Detection , 2021, EACL.
[5] Ashiqur KhudaBukhsh,et al. The Non-native Speaker Aspect: Indian English in Social Media , 2020, WNUT.
[6] Xiang Ren,et al. Fair Hate Speech Detection through Evaluation of Social Group Counterfactuals , 2020, ArXiv.
[7] Radha Poovendran,et al. Deceiving Google's Perspective API Built for Detecting Toxic Comments , 2017, ArXiv.
[8] Pascale Fung,et al. Reducing Gender Bias in Abusive Language Detection , 2018, EMNLP.
[9] Emily Denton,et al. Social Biases in NLP Models as Barriers for Persons with Disabilities , 2020, ACL.
[10] Ankit Kumar,et al. User Generated Data: Achilles' heel of BERT , 2020, ArXiv.
[11] Bernard J. Jansen,et al. Online Hate Interpretation Varies by Country, But More by Individual: A Statistical Analysis Using Crowdsourced Ratings , 2018, SNAMS.
[12] Lucy Vasserman,et al. Nuanced Metrics for Measuring Unintended Bias with Real Data for Text Classification , 2019, WWW.
[13] Margaret Mitchell,et al. Perturbation Sensitivity Analysis to Detect Unintended Model Biases , 2019, EMNLP.
[14] David Bamman,et al. Gender identity and lexical variation in social media , 2012, ArXiv.
[15] Kathleen McKeown,et al. Automatically Inferring Gender Associations from Language , 2019, EMNLP.
[16] Ralf Krestel,et al. Challenges for Toxic Comment Classification: An In-Depth Error Analysis , 2018, ALW.
[17] Rachel Rudinger,et al. “You Are Grounded!”: Latent Name Artifacts in Pre-trained Language Models , 2020, EMNLP.
[18] Ming-Wei Chang,et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding , 2019, NAACL.
[19] Antonios Anastasopoulos,et al. Towards Robust Toxic Content Classification , 2019, ArXiv.
[20] Lucy Vasserman,et al. Measuring and Mitigating Unintended Bias in Text Classification , 2018, AIES.
[21] Ben Hutchinson,et al. Re-imagining Algorithmic Fairness in India and Beyond , 2021, FAccT.
[22] S. Fiske. Prejudices in Cultural Contexts: Shared Stereotypes (Gender, Age) Versus Variable Stereotypes (Race, Ethnicity, Religion) , 2017, Perspectives on Psychological Science.
[23] Daniel Jurafsky,et al. Word embeddings quantify 100 years of gender and ethnic stereotypes , 2018, Proceedings of the National Academy of Sciences.
[24] Steven Bird,et al. NLTK: The Natural Language Toolkit , 2002, ACL.
[25] Chandler May,et al. On Measuring Social Biases in Sentence Encoders , 2019, NAACL.
[26] Ingmar Weber,et al. Racial Bias in Hate Speech and Abusive Language Detection Datasets , 2019, ALW.
[27] Mauro Conti,et al. All You Need is "Love": Evading Hate Speech Detection , 2018, ArXiv.
[28] Burt L. Monroe,et al. Fightin' Words: Lexical Feature Selection and Evaluation for Identifying the Content of Political Conflict , 2008, Political Analysis.
[29] Ankur Taly,et al. Counterfactual Fairness in Text Classification through Robustness , 2019, AIES.
[30] Thomas Wolf,et al. DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter , 2019, ArXiv.
[31] Yejin Choi,et al. The Risk of Racial Bias in Hate Speech Detection , 2019, ACL.
[32] Arvind Narayanan,et al. Semantics derived automatically from language corpora contain human-like biases , 2017, Science.
[33] Alan W Black,et al. Measuring Bias in Contextualized Word Representations , 2019, GeBNLP.