Fair Data Representation for Machine Learning at the Pareto Frontier

As machine-learning-powered decision making plays an increasingly important role in our daily lives, it is imperative to strive for fairness in the underlying data processing and algorithms. We propose a pre-processing algorithm for fair data representation via which L2-objective supervised learning algorithms yield an estimate of the Pareto frontier between prediction error and statistical disparity. In particular, the present work applies the optimal positive definite affine transport maps to approach, via a pre-processing data deformation, the post-processing Wasserstein barycenter characterization of optimal fair L2-objective supervised learning. We call the resulting data the Wasserstein pseudo-barycenter. Furthermore, we show that the Wasserstein geodesics from the learning outcome marginals to the barycenter characterize the Pareto frontier between the L2-loss and the total Wasserstein distance among the learning outcome marginals. Consequently, applying McCann interpolation generalizes the pseudo-barycenter to a family of data representations via which L2-objective supervised learning algorithms attain the Pareto frontier. Numerical simulations underscore the advantages of the proposed data representation: (1) the pre-processing step can be composed with arbitrary L2-objective supervised learning methods and applied to unseen data; (2) the fair representation protects data privacy by preventing the training machine from direct or indirect access to the sensitive information in the data; (3) the optimal affine map makes fair supervised learning computationally efficient on high-dimensional data; (4) experimental results shed light on the fairness of L2-objective unsupervised learning via the proposed fair data representation.
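To make the affine-map and McCann-interpolation idea concrete, here is a minimal sketch and not the authors' implementation. It assumes each sensitive group is summarized by its sample mean and a positive definite sample covariance, uses the closed-form optimal affine map between Gaussian distributions, and approximates the barycenter covariance with a standard fixed-point iteration. The function names (`spd_sqrt`, `affine_map`, `barycenter_cov`, `pseudo_barycenter`) and the parameter `t` are hypothetical and chosen for illustration only.

```python
import numpy as np

def spd_sqrt(S):
    """Symmetric square root of a positive semi-definite matrix."""
    w, V = np.linalg.eigh(S)
    return (V * np.sqrt(np.clip(w, 0.0, None))) @ V.T

def affine_map(m_src, S_src, m_tgt, S_tgt):
    """Positive definite affine map T(x) = A x + b that sends mean/covariance
    (m_src, S_src) to (m_tgt, S_tgt); assumes S_src is invertible."""
    R = spd_sqrt(S_src)
    R_inv = np.linalg.inv(R)
    A = R_inv @ spd_sqrt(R @ S_tgt @ R) @ R_inv
    return A, m_tgt - A @ m_src

def barycenter_cov(covs, weights, iters=100):
    """Fixed-point iteration for the covariance of the Wasserstein
    barycenter of Gaussians with the given covariances and weights."""
    S = sum(w * C for w, C in zip(weights, covs))  # initial guess
    for _ in range(iters):
        R = spd_sqrt(S)
        R_inv = np.linalg.inv(R)
        M = sum(w * spd_sqrt(R @ C @ R) for w, C in zip(weights, covs))
        S = R_inv @ M @ M @ R_inv
    return S

def pseudo_barycenter(X, z, t=1.0):
    """Move each group's rows toward the common barycenter.
    t = 0 returns the data unchanged, t = 1 applies the full affine map,
    and 0 < t < 1 is the McCann interpolation (1 - t) x + t T(x)."""
    X = np.asarray(X, dtype=float)
    groups = np.unique(z)
    weights = [np.mean(z == g) for g in groups]
    means = [X[z == g].mean(axis=0) for g in groups]
    covs = [np.atleast_2d(np.cov(X[z == g], rowvar=False)) for g in groups]
    m_bar = sum(w * m for w, m in zip(weights, means))
    S_bar = barycenter_cov(covs, weights)
    X_out = X.copy()
    for g, m, C in zip(groups, means, covs):
        A, b = affine_map(m, C, m_bar, S_bar)
        mask = z == g
        X_out[mask] = (1.0 - t) * X[mask] + t * (X[mask] @ A.T + b)
    return X_out
```

Under these assumptions, `pseudo_barycenter(X, z, t=1.0)` gives every group the same sample mean and covariance, so a downstream learner sees features whose first two moments no longer depend on the sensitive variable `z`. Intermediate values of `t` trade that parity against fidelity to the original data.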
