A Transparent Framework for Evaluating Unintended Demographic Bias in Word Embeddings

Word embedding models have gained considerable traction in the Natural Language Processing community; however, they suffer from unintended demographic biases. Most approaches for evaluating these biases rely on vector-space metrics such as the Word Embedding Association Test (WEAT). While these approaches offer valuable geometric insight into unintended bias in the embedding vector space, they do not provide an interpretable account of how the embeddings could cause discrimination in downstream NLP applications. In this work, we present a transparent framework and metric for evaluating discrimination across protected groups with respect to their word embedding bias. Our metric, Relative Negative Sentiment Bias (RNSB), measures fairness in word embeddings via the relative negative sentiment associated with demographic identity terms from various protected groups. We show that our framework and metric enable useful analysis of the bias in word embeddings.
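
The abstract describes RNSB only at a high level. Below is a minimal sketch of how such a metric can be computed, assuming the construction described in the paper body: train a binary sentiment classifier on the embedding vectors of a sentiment lexicon, score each demographic identity term with its predicted negative-sentiment probability, normalize those scores into a probability distribution over the identity terms, and take the KL divergence of that distribution from the uniform distribution (zero indicates no relative bias). All names here (embeddings, pos_words, neg_words, identity_terms) are illustrative placeholders, and the choice of logistic regression is an assumption, not necessarily the authors' exact setup.

# Minimal RNSB sketch. Assumptions (not from the abstract):
# - `embeddings` maps each word to a fixed-length numpy vector,
# - `pos_words` / `neg_words` are positive/negative sentiment lexicon lists,
# - `identity_terms` are the demographic identity terms to audit.
import numpy as np
from sklearn.linear_model import LogisticRegression

def rnsb(embeddings, pos_words, neg_words, identity_terms):
    """Relative Negative Sentiment Bias: KL divergence between the
    normalized negative-sentiment probabilities of identity terms
    and the uniform distribution (0 = perfectly fair)."""
    # 1. Train a sentiment classifier on labeled word vectors.
    X = np.array([embeddings[w] for w in pos_words + neg_words])
    y = np.array([0] * len(pos_words) + [1] * len(neg_words))  # 1 = negative
    clf = LogisticRegression(max_iter=1000).fit(X, y)

    # 2. Predict P(negative) for each identity term.
    ids = np.array([embeddings[w] for w in identity_terms])
    p_neg = clf.predict_proba(ids)[:, 1]

    # 3. Normalize the scores into a distribution over identity terms.
    p = p_neg / p_neg.sum()

    # 4. KL divergence from the uniform distribution.
    u = np.full(len(p), 1.0 / len(p))
    return float(np.sum(p * np.log(p / u)))

Under this construction, a perfectly fair embedding spreads negative sentiment evenly across the identity terms, so any deviation from uniformity, and hence any positive RNSB value, points to the groups the embedding treats unequally.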
