Statistical discrimination in learning agents

Undesired bias afflicts both human and algorithmic decision making, and may be especially prevalent when information processing trade-offs incentivize the use of heuristics. One primary example is statistical discrimination: selecting social partners based not on their underlying attributes, but on readily perceptible characteristics that covary with their suitability for the task at hand. We present a theoretical model to examine how information processing influences statistical discrimination, and we test its predictions using multi-agent reinforcement learning with various agent architectures in a partner choice-based social dilemma. As predicted, statistical discrimination emerges in agent policies as a function of both the bias in the training population and the agent architecture. All agents showed substantial statistical discrimination, defaulting to the readily available correlates instead of the outcome-relevant features. Less discrimination emerges when agents use recurrent neural networks and when their training environment has less bias. Even so, every agent algorithm we tried still exhibited substantial bias after learning in biased training populations.
