FairPy: A Toolkit for Evaluation of Social Biases and their Mitigation in Large Language Models
[1] Yoav Goldberg, et al. LM-Debugger: An Interactive Tool for Inspection and Intervention in Transformer-Based Language Models, 2022, EMNLP.
[2] Jia Yuan Yu, et al. Reward Modeling for Mitigating Toxicity in Transformer-Based Language Models, 2022, Applied Intelligence.
[3] Siva Reddy, et al. An Empirical Survey of the Effectiveness of Debiasing Techniques for Pre-trained Language Models, 2021, ACL.
[4] Alice H. Oh, et al. Mitigating Language-Dependent Ethnic Bias in BERT, 2021, EMNLP.
[5] Liam Magee, et al. Intersectional Bias in Causal Language Models, 2021, arXiv.
[6] Ruslan Salakhutdinov, et al. Towards Understanding and Mitigating Social Biases in Language Models, 2021, ICML.
[7] Goran Glavas, et al. RedditBias: A Real-World Resource for Bias Evaluation and Debiasing of Conversational Language Models, 2021, ACL.
[8] Leonardo Neves, et al. On Transferability of Bias Mitigation Effects in Language Model Fine-Tuning, 2021, NAACL.
[9] Dirk Hovy, et al. HONEST: Measuring Hurtful Sentence Completion in Language Models, 2021, NAACL.
[10] Soroush Vosoughi, et al. Mitigating Political Bias in Language Models Through Reinforced Calibration, 2021, AAAI.
[11] Timo Schick, et al. Self-Diagnosis and Self-Debiasing: A Proposal for Reducing Corpus-Based Bias in NLP, 2021, TACL.
[12] Elias Benussi, et al. Bias Out-of-the-Box: An Empirical Analysis of Intersectional Occupational Biases in Popular Generative Language Models, 2021, NeurIPS.
[13] Malvina Nissim, et al. Unmasking Contextual Stereotypes: Measuring and Mitigating BERT's Gender Bias, 2020, GeBNLP.
[14] Slav Petrov, et al. Measuring and Reducing Gendered Correlations in Pre-trained Models, 2020, arXiv.
[15] Daniel Khashabi, et al. UNQOVERing Stereotypical Biases via Underspecified Questions, 2020, Findings of EMNLP.
[16] Samuel R. Bowman, et al. CrowS-Pairs: A Challenge Dataset for Measuring Social Biases in Masked Language Models, 2020, EMNLP.
[17] Aylin Caliskan, et al. Detecting Emergent Intersectional Biases: Contextualized Word Embeddings Contain a Distribution of Human-like Biases, 2020, AIES.
[18] Sameer Singh, et al. Beyond Accuracy: Behavioral Testing of NLP Models with CheckList, 2020, ACL.
[19] Siva Reddy, et al. StereoSet: Measuring Stereotypical Bias in Pretrained Language Models, 2020, ACL.
[20] Orestis Papakyriakopoulos, et al. Bias in Word Embeddings, 2020, FAT*.
[21] Jie M. Zhang, et al. Automatic Testing and Improvement of Machine Translation, 2020, ICSE.
[22] Noah A. Smith, et al. Evaluating Gender Bias in Machine Translation, 2019, ACL.
[23] Yusu Qian, et al. Reducing Gender Bias in Word-Level Language Models with a Gender-Equalizing Loss Function, 2019, ACL.
[24] Chandler May, et al. On Measuring Social Biases in Sentence Encoders, 2019, NAACL.
[25] Luís C. Lamb, et al. Assessing Gender Bias in Machine Translation: A Case Study with Google Translate, 2018, Neural Computing and Applications.
[26] Pascale Fung, et al. Reducing Gender Bias in Abusive Language Detection, 2018, EMNLP.
[27] Jieyu Zhao, et al. Gender Bias in Coreference Resolution: Evaluation and Debiasing Methods, 2018, NAACL.
[28] Joanna Bryson, et al. Semantics Derived Automatically from Language Corpora Contain Human-like Biases, 2016, Science.
[29] Adam Tauman Kalai, et al. Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings, 2016, NIPS.
[30] R. Beran. Minimum Hellinger Distance Estimates for Parametric Models, 1977, The Annals of Statistics.
[31] Federico Bianchi, et al. Pipelines for Social Bias Testing of Large Language Models, 2022, BigScience Workshop.
[32] Navid Rekabsaz, et al. Parameter Efficient Diff Pruning for Bias Mitigation, 2022, arXiv.