On the Direction of Discrimination: An Information-Theoretic Analysis of Disparate Impact in Machine Learning

In the context of machine learning, disparate impact refers to a form of systematic discrimination whereby a model's output distribution depends on the value of a sensitive attribute (e.g., race or gender). In this paper, we propose an information-theoretic framework for analyzing the disparate impact of a binary classification model. We view the model as a fixed channel and quantify disparate impact as the divergence between the output distributions the channel induces on two groups. Our aim is to find a correction function that perturbs each group's input distribution so that the resulting output distributions align. We formulate an optimization problem whose solution is a correction function that renders the two output distributions statistically indistinguishable, derive closed-form expressions that allow the correction to be computed efficiently, and demonstrate the benefits of our framework on a recidivism prediction task based on the ProPublica COMPAS dataset.
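To make the channel view concrete, the following is a minimal sketch (not the paper's exact formulation) of the core idea: a fixed classifier is treated as a channel P(Y|X) over a discrete input alphabet, disparate impact is measured as a divergence between the two groups' output distributions, and a perturbed input distribution for one group is found numerically so that its output distribution aligns with the other's. The toy channel, the trade-off objective, and all variable names are illustrative assumptions.

```python
# Sketch of the fixed-channel view of disparate impact correction.
# Assumptions (not from the paper): a toy 3-symbol input alphabet,
# a KL-based objective with an arbitrary 0.1 proximity weight, and
# a generic SLSQP solve in place of the paper's closed-form solution.
import numpy as np
from scipy.optimize import minimize

def kl(p, q, eps=1e-12):
    """KL divergence D(p || q) for discrete distributions."""
    p, q = np.clip(p, eps, None), np.clip(q, eps, None)
    return float(np.sum(p * np.log(p / q)))

# Fixed channel: rows index inputs x, columns index outputs y in {0, 1}.
channel = np.array([[0.9, 0.1],
                    [0.6, 0.4],
                    [0.2, 0.8]])

p_ref  = np.array([0.5, 0.3, 0.2])   # input distribution, reference group
p_prot = np.array([0.2, 0.3, 0.5])   # input distribution, protected group

q_ref = p_ref @ channel              # output distribution, reference group

def objective(p):
    # Align the protected group's output with the reference output while
    # staying close to the original input distribution (weight is an assumption).
    return kl(p @ channel, q_ref) + 0.1 * kl(p, p_prot)

res = minimize(
    objective,
    x0=p_prot,
    method="SLSQP",
    bounds=[(0.0, 1.0)] * len(p_prot),
    constraints=[{"type": "eq", "fun": lambda p: p.sum() - 1.0}],
)
p_corrected = res.x
print("original output gap :", kl(p_prot @ channel, q_ref))
print("corrected output gap:", kl(p_corrected @ channel, q_ref))
```

In this sketch the corrected output gap drops essentially to zero, which is the sense in which the perturbed inputs make the two groups' output distributions indistinguishable; the paper's contribution is obtaining such a correction in closed form rather than by a generic solver.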
