Residual Unfairness in Fair Machine Learning from Prejudiced Data

Recent work on fairness in machine learning has proposed adjusting for fairness by equalizing accuracy metrics across groups, and has separately studied how datasets affected by historical prejudices may lead to unfair decision policies. We connect these lines of work and study the residual unfairness that arises when a fairness-adjusted predictor is not actually fair on the target population, due to systematic censoring of the training data by existing biased policies. This scenario is particularly common in the very applications where fairness is a concern. We theoretically characterize the impact of such censoring on standard fairness metrics for binary classifiers and provide criteria for when residual unfairness may or may not appear. We prove that, under certain conditions, fairness-adjusted classifiers will in fact induce residual unfairness that perpetuates the same injustices, against the same groups, that biased the data to begin with, showing that even state-of-the-art fair machine learning can have a "bias in, bias out" property. When certain benchmark data are available, we show how sample reweighting can estimate and adjust fairness metrics while accounting for censoring. We use this to study the case of Stop, Question, and Frisk (SQF) and demonstrate that attempting to adjust for fairness perpetuates the same injustices that the policy is infamous for.
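To make the reweighting idea mentioned above concrete, the following is a minimal Horvitz-Thompson-style sketch, not the paper's implementation: it estimates group-conditional true positive rates on censored data by weighting each labeled example by the inverse of its inclusion probability under the historical policy. The inputs, including the inclusion probabilities `pi` and the helper `weighted_tpr_by_group`, are illustrative assumptions.

```python
# Illustrative sketch: inverse-propensity-weighted estimation of group-wise
# true positive rates when the labeled data were censored by a prior policy.
import numpy as np

def weighted_tpr_by_group(group, y_true, y_pred, pi):
    """Estimate P(y_pred = 1 | y_true = 1, group = g) with weights 1 / pi,
    where pi is the probability each example was labeled under the old policy."""
    w = 1.0 / pi  # inverse-propensity weights correct for censoring
    rates = {}
    for g in np.unique(group):
        mask = (group == g) & (y_true == 1)
        rates[g] = np.sum(w[mask] * y_pred[mask]) / np.sum(w[mask])
    return rates

# Toy usage with synthetic data (hypothetical values):
rng = np.random.default_rng(0)
n = 1000
group = rng.integers(0, 2, n)            # protected-group indicator
y_true = rng.integers(0, 2, n)           # true outcomes
y_pred = rng.integers(0, 2, n)           # classifier decisions
pi = rng.uniform(0.2, 0.9, n)            # inclusion probabilities under the old policy
print(weighted_tpr_by_group(group, y_true, y_pred, pi))
```

Comparing the reweighted rates across groups then gives a censoring-adjusted view of metrics such as equality of opportunity, under the assumption that the inclusion probabilities are known or can be estimated from benchmark data.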
