Marked Attribute Bias in Natural Language Inference

Reporting harmful bias in NLP applications, and providing test sets for it, is essential to building a robust understanding of the current problem. We present a new observation of gender bias in a downstream NLP application: marked attribute bias in natural language inference. Bias in downstream applications can stem from the training data or the word embeddings, or it can be amplified by the model in use; however, addressing biased word embeddings is potentially the most impactful first step because of their universal nature. Here we seek to understand how the intrinsic properties of word embeddings contribute to the observed marked attribute effect, and whether current post-processing methods address the bias successfully. An investigation of the current debiasing landscape reveals two open problems: none of the current debiased embeddings mitigates the marked attribute error, and none of the intrinsic bias measures is predictive of the marked attribute effect. Observing that a new type of intrinsic bias measure correlates meaningfully with the marked attribute effect, we propose a new post-processing debiasing scheme for static word embeddings. Applied to existing embeddings, the proposed method achieves new best results on the marked attribute bias test set. Code and data are available at https://github.com/hillary-dawkins/MAB.
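For context on what post-processing debiasing of static word embeddings typically involves, the following is a minimal sketch of a generic projection-based approach in the style of hard debiasing (Bolukbasi et al., 2016). It is an illustration only, not the scheme proposed in this work; the word pairs, vocabulary, and random toy vectors are assumptions introduced for the example.

```python
# Generic projection-based post-processing debiasing sketch (hard-debias style).
# NOT the paper's proposed method; vocabulary and vectors below are toy assumptions.
import numpy as np

def gender_direction(emb, pairs):
    """Estimate a bias direction as the dominant direction of centered
    differences over definitional pairs (e.g. he/she, man/woman)."""
    diffs = []
    for a, b in pairs:
        center = (emb[a] + emb[b]) / 2
        diffs.append(emb[a] - center)
        diffs.append(emb[b] - center)
    # First right singular vector = dominant (unit-norm) bias direction.
    _, _, vt = np.linalg.svd(np.array(diffs), full_matrices=False)
    return vt[0]

def debias(vec, direction):
    """Remove the component of a word vector along the bias direction."""
    return vec - np.dot(vec, direction) * direction

# Usage with a toy embedding dictionary (hypothetical 50-d vectors):
emb = {w: np.random.randn(50) for w in ["he", "she", "man", "woman", "nurse"]}
d = gender_direction(emb, [("he", "she"), ("man", "woman")])
emb["nurse"] = debias(emb["nurse"], d)
```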
