A minimax framework for quantifying risk-fairness trade-off in regression

We propose a theoretical framework for the problem of learning a real-valued function that meets fairness requirements. This framework is built upon the notion of $\alpha$-relative (fairness) improvement of the regression function, which we introduce using the theory of optimal transport. Setting $\alpha = 0$ corresponds to the regression problem under the Demographic Parity constraint, while $\alpha = 1$ corresponds to the classical regression problem without any constraints. For $\alpha \in (0, 1)$, the proposed framework allows one to interpolate continuously between these two extreme cases and to study partially fair predictors. Within this framework, we precisely quantify the cost in risk induced by the introduction of the fairness constraint. We put forward a statistical minimax setup and derive a general problem-dependent lower bound on the risk of any estimator satisfying the $\alpha$-relative improvement constraint. We illustrate our framework on a model of linear regression with Gaussian design and systematic group-dependent bias, deriving matching (up to absolute constants) upper and lower bounds on the minimax risk under the introduced constraint. Finally, we perform a simulation study of the latter setup.
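
The interpolation between $\alpha = 0$ and $\alpha = 1$ can be made concrete on a toy version of the linear model with group-dependent bias. The sketch below is illustrative only and rests on assumptions that the abstract does not state: unfairness is measured as the weighted squared 2-Wasserstein distance of the group-wise prediction distributions to their barycenter, those distributions are one-dimensional Gaussians with a common standard deviation, and the partially fair predictor is taken to be the mixture with weight $\sqrt{\alpha}$ on the unconstrained predictor. The function names `unfairness` and `partially_fair_biases` are hypothetical.

```python
import numpy as np

def unfairness(means, stds, weights):
    """Weighted squared W2 distance of 1D Gaussians to their barycenter.

    Assumption: group-wise prediction laws are N(means[s], stds[s]**2).
    For 1D Gaussians the W2 barycenter is Gaussian with the weighted
    average mean and the weighted average standard deviation.
    """
    bar_mean = weights @ means
    bar_std = weights @ stds
    return float(weights @ ((means - bar_mean) ** 2 + (stds - bar_std) ** 2))

def partially_fair_biases(biases, weights, alpha):
    """Shrink group biases toward their weighted mean.

    Assumption (not from the abstract): weight sqrt(alpha) is placed on
    the original biases, so that unfairness scales by exactly alpha.
    """
    t = np.sqrt(alpha)
    return t * biases + (1.0 - t) * (weights @ biases)

# Toy setup: two equally likely groups with opposite systematic biases.
biases = np.array([1.0, -1.0])
weights = np.array([0.5, 0.5])
stds = np.array([2.0, 2.0])  # common spread of the linear part (assumed)

for alpha in (0.0, 0.25, 1.0):
    new_biases = partially_fair_biases(biases, weights, alpha)
    u = unfairness(new_biases, stds, weights)
    # Excess squared risk over the unconstrained predictor: the two
    # predictors differ only by a group-wise constant shift.
    excess = float(weights @ (biases - new_biases) ** 2)
    print(f"alpha={alpha:.2f}  unfairness={u:.4f}  excess risk={excess:.4f}")
```

In this toy setup the unfairness of the unconstrained predictor equals 1, so the loop shows the unfairness falling to $\alpha$ while the excess risk grows to $(1 - \sqrt{\alpha})^2$, which is one way to read the risk-fairness trade-off described above.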
