Semantics derived automatically from language corpora contain human-like biases

Artificial intelligence and machine learning are in a period of astounding growth. However, there are concerns that these technologies may be used, either with or without intention, to perpetuate the prejudice and unfairness that unfortunately characterize many human institutions. Here we show for the first time that human-like semantic biases result from the application of standard machine learning to ordinary language: the same sort of language humans are exposed to every day. We replicate a spectrum of standard human biases as exposed by the Implicit Association Test and other well-known psychological studies. We do so using a widely used, purely statistical machine-learning model, the GloVe word embedding, trained on a corpus of text from the Web. Our results indicate that language itself contains recoverable and accurate imprints of our historic biases, whether morally neutral, as toward insects or flowers; problematic, as toward race or gender; or simply veridical, reflecting the status quo distribution of gender with respect to careers or first names. These regularities are captured by machine learning along with the rest of semantics. In addition to our empirical findings concerning language, we contribute new methods for evaluating bias in text: the Word Embedding Association Test (WEAT) and the Word Embedding Factual Association Test (WEFAT). Our results have implications not only for AI and machine learning but also for psychology, sociology, and human ethics, since they raise the possibility that mere exposure to everyday language can account for the biases we replicate here.

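For concreteness, both tests reduce to simple cosine-similarity statistics over word vectors. For a word w and attribute word sets A and B, the differential association is s(w, A, B) = mean_{a in A} cos(w, a) - mean_{b in B} cos(w, b); WEAT compares this quantity across two target sets X and Y, while WEFAT normalizes it for a single word. The sketch below is a minimal Python rendering, not the authors' released code: the `vec` dictionary mapping words to NumPy embedding vectors is an assumed input (e.g., rows loaded from a pretrained GloVe file), and the use of the sample standard deviation in the normalizations is an implementation choice.

```python
import numpy as np

def cosine(u, v):
    # Cosine similarity between two embedding vectors.
    return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

def association(w, A, B, vec):
    # s(w, A, B): mean cosine similarity of word w to the attribute
    # words in A, minus its mean similarity to the attribute words in B.
    return (np.mean([cosine(vec[w], vec[a]) for a in A]) -
            np.mean([cosine(vec[w], vec[b]) for b in B]))

def weat_effect_size(X, Y, A, B, vec):
    # WEAT effect size: the difference between the mean association of
    # target set X and of target set Y with (A, B), normalized by the
    # standard deviation of s(w, A, B) over all w in X union Y.
    # (ddof=1, the sample standard deviation, is an assumption here.)
    sX = [association(x, A, B, vec) for x in X]
    sY = [association(y, A, B, vec) for y in Y]
    return (np.mean(sX) - np.mean(sY)) / np.std(sX + sY, ddof=1)

def wefat_statistic(w, A, B, vec):
    # WEFAT: the association of a single word w with (A, B), normalized
    # by the spread of its similarities across all attribute words.
    sims = [cosine(vec[w], vec[x]) for x in A + B]
    return association(w, A, B, vec) / np.std(sims, ddof=1)

# Hypothetical toy usage, with illustrative word lists:
# d = weat_effect_size(["rose", "tulip"], ["ant", "wasp"],
#                      ["pleasant", "love"], ["awful", "hate"], vec)
```

In the paper, the significance of a WEAT result is assessed with a permutation test over equal-size repartitions of the target words; that step is omitted from this sketch.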