Repairing without Retraining: Avoiding Disparate Impact with Counterfactual Distributions

When the performance of a machine learning model varies over groups defined by sensitive attributes (e.g., gender or ethnicity), the performance disparity can be expressed in terms of the probability distributions of the input and output variables for each group. In this paper, we exploit this fact to reduce the disparate impact of a fixed classification model over a population of interest. Given a black-box classifier, we aim to eliminate the performance gap by perturbing the distribution of input variables for the disadvantaged group. We refer to the perturbed distribution as a counterfactual distribution, and characterize its properties for common fairness criteria. We introduce a descent algorithm to learn a counterfactual distribution from data. We then describe how the estimated distribution can be used to build a data preprocessor that reduces disparate impact without retraining the model. We validate our approach through experiments on real-world datasets, showing that it can repair different forms of disparity without a significant drop in accuracy.
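To make the idea concrete, here is a minimal sketch, not the paper's actual descent algorithm: it assumes statistical parity (the gap in mean scores between groups) as the disparity measure, represents the counterfactual distribution as a reweighting of the disadvantaged group's empirical sample, and runs projected gradient descent on the weights to close the gap for a fixed black-box scorer. All names and the synthetic data are illustrative assumptions, not artifacts of the paper.

```python
# Minimal sketch: learn a counterfactual distribution as a reweighting of the
# disadvantaged group's sample, closing a statistical-parity gap for a fixed
# black-box scorer `f`. Illustrative only; not the authors' algorithm.
import numpy as np

def fit_counterfactual_weights(f, X_disadv, X_adv, steps=1000, lr=0.1):
    """Learn sample weights `w` so that the weighted mean score of the
    disadvantaged group matches the mean score of the advantaged group."""
    s = f(X_disadv)                    # scores computed once: f stays fixed
    target = f(X_adv).mean()           # advantaged group's mean score
    w = np.full(len(X_disadv), 1.0 / len(X_disadv))  # start from empirical dist.
    for _ in range(steps):
        gap = w @ s - target           # signed statistical-parity gap
        w -= lr * 2.0 * gap * s        # gradient of gap**2 with respect to w
        w = np.clip(w, 0.0, None)      # crude projection back onto the
        w /= w.sum()                   # probability simplex
    return w

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X0 = rng.normal(0.0, 1.0, size=(500, 3))             # disadvantaged group
    X1 = rng.normal(0.5, 1.0, size=(500, 3))             # advantaged group
    f = lambda X: 1.0 / (1.0 + np.exp(-X.sum(axis=1)))   # stand-in fixed model
    w = fit_counterfactual_weights(f, X0, X1)
    print("gap before:", f(X0).mean() - f(X1).mean())
    print("gap after: ", w @ f(X0) - f(X1).mean())
```

A preprocessor would then map inputs from the observed distribution toward the estimated counterfactual one at prediction time, e.g., via an optimal-transport coupling between the two distributions; the reweighting above is used only as a device to keep the sketch self-contained.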
