Measuring Social Bias in Knowledge Graph Embeddings

It has recently been shown that word embeddings encode social biases, with a harmful impact on downstream tasks. However, to date no similar work has been done in the field of graph embeddings. We present the first study of social bias in knowledge graph embeddings and propose a new metric suitable for measuring such bias. We conduct experiments on Wikidata and Freebase and show that, as with word embeddings, harmful social biases related to professions are encoded in the embeddings with respect to gender, religion, ethnicity and nationality. For example, graph embeddings encode the information that men are more likely to be bankers, and women more likely to be homemakers. As graph embeddings become increasingly utilized, we suggest it is important that the existence of such biases is understood and that steps are taken to mitigate their impact.
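The abstract does not spell out the proposed metric, but the following minimal sketch (Python/NumPy, with hypothetical names throughout) illustrates one plausible way such a probe could work under a TransE-style scoring function: shift a person's embedding toward the "male" and "female" attribute values along the gender relation, and compare how the plausibility of a profession triple changes. The step-based update and the specific functions below are assumptions for illustration, not the method described in the paper.

```python
import numpy as np

def transe_score(head, relation, tail):
    # TransE plausibility: higher (less negative) means the triple
    # (head, relation, tail) is judged more plausible by the embedding.
    return -np.linalg.norm(head + relation - tail, ord=1)

def profession_bias(person_vec, gender_rel, male_vec, female_vec,
                    has_profession_rel, profession_vec, step=1.0):
    """Hypothetical bias probe: nudge a person's embedding toward the
    'male' and 'female' attribute directions along the gender relation,
    then compare the score of the (person, has_profession, profession)
    triple under each shifted embedding."""
    toward_male = person_vec + step * (male_vec - (person_vec + gender_rel))
    toward_female = person_vec + step * (female_vec - (person_vec + gender_rel))
    score_male = transe_score(toward_male, has_profession_rel, profession_vec)
    score_female = transe_score(toward_female, has_profession_rel, profession_vec)
    # Positive value: the profession looks more plausible for the
    # male-shifted embedding than for the female-shifted one.
    return score_male - score_female

# Usage with random vectors, purely to show the probe runs end to end.
rng = np.random.default_rng(0)
dim = 50
vecs = {name: rng.normal(size=dim) for name in
        ("person", "gender_rel", "male", "female", "has_profession", "banker")}
print(profession_bias(vecs["person"], vecs["gender_rel"], vecs["male"],
                      vecs["female"], vecs["has_profession"], vecs["banker"]))
```

Averaging such a score over many person entities and professions would yield a population-level bias estimate for a given sensitive attribute, in the spirit of the gender, religion, ethnicity and nationality results reported in the abstract.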
