Mitigating Political Bias in Language Models Through Reinforced Calibration
Ruibo Liu | Chenyan Jia | Jason Wei | Guangxuan Xu | Lili Wang | Soroush Vosoughi