Joint Transfer of Model Knowledge and Fairness Over Domains Using Wasserstein Distance

Owing to the increasing use of machine learning in our daily lives, fairness has recently become an important topic in the machine learning community. Recent studies on fairness in machine learning have sought to ensure statistical independence between individual model predictions and designated sensitive attributes. In practice, however, the sensitive variables of the data used to train a model can differ from those of the data to which the model is applied. In this paper, we investigate a methodology for developing a fair classification model for data with limited or no labels, by transferring knowledge from another data domain where information is fully available. This is done by controlling the Wasserstein distances between relevant distributions. As a result, we obtain a fair model that can be successfully applied to two datasets with different sensitive attributes. We present theoretical results showing that our approach provably transfers both classification performance and fairness across domains. Experimental results show that our method promotes fairness in the target domain while retaining reasonable classification accuracy, and that it often outperforms comparative models in terms of joint fairness.
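
To make the notion of a Wasserstein distance between distributions concrete, the following is a minimal sketch of the empirical 1-Wasserstein distance between two one-dimensional samples, for example the predicted scores of two sensitive groups. It is an illustration only and not the method of the paper: the function names `wasserstein_1d` and `group_score_gap`, and the idea of comparing score distributions across two groups labelled 0 and 1, are assumptions introduced here.

```python
from bisect import bisect_right


def wasserstein_1d(xs, ys):
    """Empirical 1-Wasserstein distance between two 1-D samples.

    In one dimension this equals the integral of |F_x(t) - F_y(t)| dt,
    where F_x and F_y are the empirical CDFs of the two samples.
    """
    xs, ys = sorted(xs), sorted(ys)
    grid = sorted(xs + ys)  # both CDFs are constant between grid points
    total = 0.0
    for left, right in zip(grid, grid[1:]):
        cdf_x = bisect_right(xs, left) / len(xs)
        cdf_y = bisect_right(ys, left) / len(ys)
        total += abs(cdf_x - cdf_y) * (right - left)
    return total


def group_score_gap(scores, groups):
    """Hypothetical fairness proxy (an assumption, not the paper's objective):
    distance between the score distributions of groups labelled 0 and 1."""
    scores_0 = [s for s, g in zip(scores, groups) if g == 0]
    scores_1 = [s for s, g in zip(scores, groups) if g == 1]
    return wasserstein_1d(scores_0, scores_1)


if __name__ == "__main__":
    scores = [0.2, 0.3, 0.4, 0.5, 0.6, 0.7]
    groups = [0, 1, 0, 1, 0, 1]
    print(group_score_gap(scores, groups))
```

A value near zero indicates that the two groups receive similarly distributed scores. In a training setting, a differentiable estimate of such a distance would typically be added to the loss as a penalty; the exact distributions and estimator used by the authors are not specified in this abstract.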
