Quantifying Gender Bias in Different Corpora

Word embedding models have been shown to be effective in a wide variety of Natural Language Processing (NLP) tasks, such as identifying audiences for web advertisements, parsing résumés to select promising job candidates, and translating documents from one language to another. However, it has been demonstrated that NLP systems learn gender bias from the corpora of documents on which they are trained. It is increasingly common for pre-trained models to serve as the starting point for applications in a wide range of areas, including critical decision-making applications, and it is easy to adopt a pre-trained model without careful consideration of the nature of its original training set. In this paper, we quantify the degree to which gender bias differs with the corpora used for training. We look especially at the impact of starting with a pre-trained model and fine-tuning with additional data. Specifically, we calculate a measure of direct gender bias on several pre-trained models, including the BERT models pre-trained on Wikipedia and BookCorpus, as well as on models fine-tuned on several General Language Understanding Evaluation (GLUE) benchmark tasks. In addition, we evaluate the bias of several more extreme corpora: the Jigsaw toxic identity dataset, which includes toxic speech biased against race, gender, religion, and disability, and the RtGender dataset, which includes speech specifically labeled by gender. Our results reveal that the direct gender bias of the Jigsaw toxic identity dataset is surprisingly close to that of the base pre-trained Google model, whereas the RtGender dataset shows significantly higher direct gender bias than the base model. Because the bias learned by an NLP system can vary significantly with the training corpora, it is important to consider and report these details, especially for models used in critical decision-making applications.
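For concreteness, the direct gender bias measure referenced above follows Bolukbasi et al. (2016): project each gender-neutral word vector onto a learned gender direction and average the magnitudes of the (cosine) projections. The sketch below is a minimal illustration of that computation; the toy vectors, the simple he/she difference direction, and the word list are illustrative assumptions, not the exact setup evaluated in the paper.

```python
# Minimal sketch of the direct gender bias measure of Bolukbasi et al. (2016):
# DirectBias_c = (1/|N|) * sum over w in N of |cos(w, g)|^c,
# where g is the gender direction and N is a set of gender-neutral words.
import numpy as np

def direct_bias(embeddings, gender_direction, neutral_words, c=1.0):
    """Average |cos(w, g)|^c over the gender-neutral word set."""
    g = gender_direction / np.linalg.norm(gender_direction)
    total = 0.0
    for word in neutral_words:
        w = embeddings[word]
        cos = np.dot(w, g) / np.linalg.norm(w)
        total += abs(cos) ** c
    return total / len(neutral_words)

# Toy 4-dimensional embeddings, standing in for vectors extracted from a
# trained model (e.g., word2vec or BERT embeddings for a given corpus).
emb = {
    "he":       np.array([ 1.0, 0.1, 0.0, 0.2]),
    "she":      np.array([-1.0, 0.1, 0.0, 0.2]),
    "engineer": np.array([ 0.4, 0.8, 0.3, 0.1]),
    "nurse":    np.array([-0.5, 0.7, 0.2, 0.3]),
}

# A simple gender direction from one definitional pair; Bolukbasi et al.
# instead take the principal component of several such pair differences.
g = emb["he"] - emb["she"]

print(direct_bias(emb, g, ["engineer", "nurse"]))
```

Comparing this statistic across embeddings trained (or fine-tuned) on different corpora is what allows the per-corpus bias comparisons described above.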
