On Measuring Social Biases in Sentence Encoders

The Word Embedding Association Test shows that GloVe and word2vec word embeddings exhibit human-like implicit biases based on gender, race, and other social constructs (Caliskan et al., 2017). Meanwhile, research on learning reusable text representations has begun to explore sentence-level texts, with some sentence encoders seeing enthusiastic adoption. Accordingly, we extend the Word Embedding Association Test to measure bias in sentence encoders. We then test several sentence encoders, including state-of-the-art methods such as ELMo and BERT, for the social biases studied in prior work and two important biases that are difficult or impossible to test at the word level. We observe mixed results, including suspicious patterns of sensitivity that suggest the test's assumptions may not hold in general. We conclude by proposing directions for future work on measuring bias in sentence encoders.
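The association test underlying this work scores how strongly two sets of target representations (X, Y) differ in their cosine similarity to two sets of attribute representations (A, B). The following is a minimal sketch of the WEAT-style effect size from Caliskan et al. (2017); with word vectors as inputs it corresponds to WEAT, and with sentence-encoder outputs for templated sentences it corresponds to the sentence-level extension described in the abstract. All function names are illustrative, and the choice of sample vs. population standard deviation is an assumption.

```python
import numpy as np

def cos(u, v):
    # Cosine similarity between two vectors.
    return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

def assoc(w, A, B):
    # s(w, A, B): mean cosine similarity of w to attribute set A
    # minus its mean cosine similarity to attribute set B.
    return np.mean([cos(w, a) for a in A]) - np.mean([cos(w, b) for b in B])

def effect_size(X, Y, A, B):
    # Cohen's-d-style effect size over the two target sets:
    # (mean_x s(x,A,B) - mean_y s(y,A,B)) / std of s over X ∪ Y.
    # Sample std (ddof=1) is an assumption; the original may use ddof=0.
    sX = [assoc(x, A, B) for x in X]
    sY = [assoc(y, A, B) for y in Y]
    return (np.mean(sX) - np.mean(sY)) / np.std(sX + sY, ddof=1)
```

At the sentence level, X, Y, A, and B would hold encoder outputs for simple template sentences built around the original WEAT terms (e.g., "This is a <term>."); significance is then assessed with a permutation test over reassignments of X ∪ Y, which is omitted here for brevity.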

[1]  S. Bem. The measurement of psychological androgyny, 1974, Journal of Consulting and Clinical Psychology.

[2]  S. Holm. A Simple Sequentially Rejective Multiple Test Procedure, 1979.

[3]  K. Crenshaw. Demarginalizing the Intersection of Race and Sex: A Black Feminist Critique of Antidiscrimination Doctrine, Feminist Theory and Antiracist Politics, 1989.

[4]  A. Greenwald, et al. Measuring individual differences in implicit cognition: the Implicit Association Test, 1998, Journal of Personality and Social Psychology.

[5]  Brian A. Nosek, et al. Math = male, me = female, therefore math ≠ me, 2002.

[6]  Brian A. Nosek, et al. Understanding and using the Implicit Association Test: I. An improved scoring algorithm, 2003, Journal of Personality and Social Psychology.

[7]  P. Stone, et al. Fast-Track Women and the “Choice” to Stay Home, 2004.

[8]  M. Heilman, et al. Penalties for success: reactions to women who succeed at male gender-typed tasks, 2004, The Journal of Applied Psychology.

[9]  M. Banaji, et al. Understanding and Using the Implicit Association Test: III. Meta-analysis of Predictive Validity, 2006.

[10]  D. Madison. Crazy Patriotism and Angry (Post)Black Women, 2009.

[11]  Aurélie Herbelot, et al. Distributional techniques for philosophical enquiry, 2012, LaTeCH@EACL.

[12]  K. Mitchell. Raunch versus prude: contemporary sex blogs and erotic memoirs by women, 2012.

[13]  Latanya Sweeney. Discrimination in online ad delivery, 2013, CACM.

[14]  Jeffrey Dean, et al. Distributed Representations of Words and Phrases and their Compositionality, 2013, NIPS.

[15]  Toni Pressley-Sanon. Sister Citizen: Shame, Stereotypes, and Black Women in America, 2013.

[16]  G. N. Rider, et al. Black sexual politics: African Americans, gender, and the new racism, 2014, Culture, Health & Sexuality.

[17]  Jeffrey Pennington, et al. GloVe: Global Vectors for Word Representation, 2014, EMNLP.

[18]  Sanja Fidler, et al. Aligning Books and Movies: Towards Story-Like Visual Explanations by Watching Movies and Reading Books, 2015, ICCV.

[19]  Hal Daumé III, et al. Deep Unordered Composition Rivals Syntactic Methods for Text Classification, 2015, ACL.

[20]  Christopher Potts, et al. A large annotated corpus for learning natural language inference, 2015, EMNLP.

[21]  Adam Tauman Kalai, et al. Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings, 2016, NIPS.

[22]  Andra Gillespie. Race, Perceptions of Femininity, and the Power of the First Lady: A Comparative Analysis, 2016.

[23]  Rachael Tatman. Gender and Dialect Bias in YouTube’s Automatic Captions, 2017, EthNLP@EACL.

[24]  Chandler May, et al. Social Bias in Elicited Natural Language Inferences, 2017, EthNLP@EACL.

[25]  Arvind Narayanan, et al. Semantics derived automatically from language corpora contain human-like biases, 2017, Science.

[26]  Holger Schwenk, et al. Supervised Learning of Universal Sentence Representations from Natural Language Inference Data, 2017, EMNLP.

[27]  Nan Hua, et al. Universal Sentence Encoder for English, 2018, EMNLP.

[28]  Timnit Gebru, et al. Gender Shades: Intersectional Accuracy Disparities in Commercial Gender Classification, 2018, FAT*.

[29]  Luke S. Zettlemoyer, et al. Deep Contextualized Word Representations, 2018, NAACL.

[30]  Saif Mohammad, et al. Examining Gender and Race Bias in Two Hundred Sentiment Analysis Systems, 2018, *SEM.

[31]  Samuel R. Bowman, et al. A Broad-Coverage Challenge Corpus for Sentence Understanding through Inference, 2018, NAACL.

[32]  Natalie Schluter. The glass ceiling in NLP, 2018, EMNLP.

[33]  Christopher Joseph Pal, et al. Learning General Purpose Distributed Sentence Representations via Large Scale Multi-task Learning, 2018, ICLR.

[34]  Daniel Jurafsky, et al. Word embeddings quantify 100 years of gender and ethnic stereotypes, 2018, Proceedings of the National Academy of Sciences.

[35]  Alec Radford, et al. Improving Language Understanding by Generative Pre-Training, 2018.

[36]  Ming-Wei Chang, et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, 2019, NAACL.