An example of prediction which complies with Demographic Parity and equalizes group-wise risks in the context of regression

Let $(X, S, Y) \in \mathbb{R}^p \times \{1, 2\} \times \mathbb{R}$ be a triplet following some joint distribution $\mathbb{P}$, with feature vector $X$, sensitive attribute $S$, and target variable $Y$. The Bayes-optimal prediction $f^*$, which does not produce Disparate Treatment, is defined as $f^*(x) = \mathbb{E}[Y \mid X = x]$. We provide a non-trivial example of a prediction $x \mapsto f(x)$ which satisfies two common group-fairness notions: Demographic Parity, \begin{align} (f(X) \mid S = 1) \stackrel{d}{=} (f(X) \mid S = 2), \end{align} where $\stackrel{d}{=}$ denotes equality in distribution, and Equal Group-Wise Risks, \begin{align} \mathbb{E}[(f^*(X) - f(X))^2 \mid S = 1] = \mathbb{E}[(f^*(X) - f(X))^2 \mid S = 2]. \end{align} To the best of our knowledge, this is the first explicit construction of a non-constant predictor satisfying both conditions simultaneously. We discuss several implications of this result for a better understanding of mathematical notions of algorithmic fairness.
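To make the two notions concrete, the following is a minimal Monte Carlo sketch of how one can check them empirically. Everything in it is an illustrative assumption rather than the paper's construction: a toy Gaussian model with $X \mid S = 1 \sim \mathcal{N}(-1, 1)$, $X \mid S = 2 \sim \mathcal{N}(1, 1)$, $f^*(x) = x$, and the candidate predictor $f(x) = |x|$. By symmetry this $f$ satisfies Demographic Parity exactly, yet its group-wise risks differ sharply, which illustrates why a non-constant predictor meeting both conditions at once is hard to come by.

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)
n = 200_000

# Toy symmetric model (an assumption for illustration, NOT the paper's construction):
# X | S=1 ~ N(-1, 1),  X | S=2 ~ N(+1, 1),  Bayes prediction f*(x) = x.
x1 = rng.normal(-1.0, 1.0, n)
x2 = rng.normal(+1.0, 1.0, n)
f_star = lambda x: x

# Candidate non-constant, group-blind predictor: f(x) = |x|.
f = np.abs

# Demographic Parity: compare the group-wise distributions of f(X) with a
# two-sample Kolmogorov-Smirnov statistic (0 would mean identical distributions).
dp_gap = ks_2samp(f(x1), f(x2)).statistic
print(f"DP gap (KS statistic): {dp_gap:.4f}")  # ~0 up to sampling noise: DP holds

# Equal Group-Wise Risks: compare E[(f*(X) - f(X))^2 | S = s] across groups.
risk1 = np.mean((f_star(x1) - f(x1)) ** 2)
risk2 = np.mean((f_star(x2) - f(x2)) ** 2)
print(f"group-wise risks: {risk1:.3f} vs {risk2:.3f}")  # ~7.7 vs ~0.3: far from equal
```

Under this toy model the predictor passes the distributional check but fails the risk check, so satisfying both fairness notions with a non-constant $f$ requires a more careful construction, such as the one given in the paper.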
