Inherent Tradeoffs in Learning Fair Representations

With the prevalence of machine learning in high-stakes applications, especially those regulated by anti-discrimination laws or societal norms, it is crucial to ensure that predictive models do not propagate any existing bias or discrimination. Due to the ability of deep neural nets to learn rich representations, recent advances in algorithmic fairness have focused on learning fair representations with adversarial techniques that reduce bias in the data while preserving utility. In this paper, through the lens of information theory, we provide the first result that quantitatively characterizes the tradeoff between demographic parity and the joint utility across different population groups. Specifically, when the base rates differ between groups, we show that any method aiming to learn fair representations admits an information-theoretic lower bound on the joint error across these groups. To complement this negative result, we also prove that if the optimal decision functions of the different groups are close, then learning fair representations leads to an alternative notion of fairness, known as accuracy parity, which requires that error rates be close between groups. Our theoretical findings are confirmed empirically on real-world datasets. We believe our insights contribute to a better understanding of the tradeoff between utility and different notions of fairness.
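
To make the lower bound concrete, the following is a minimal formal sketch; the notation ($D_0$, $D_1$, $A$, $g$, $h$, $\Delta_{BR}$) is illustrative and not fixed by the abstract. Let $D_0$ and $D_1$ denote the distributions of the two population groups, identified by a protected attribute $A$, and let $\Delta_{BR} := |\Pr_{D_0}(Y = 1) - \Pr_{D_1}(Y = 1)|$ be the gap between their base rates. For any predictor $\hat{Y} = h(g(X))$ built from a representation $g$ that satisfies demographic parity, i.e., $\hat{Y} \perp A$, a lower bound of the kind described above takes the form
$$\varepsilon_{D_0}(h \circ g) + \varepsilon_{D_1}(h \circ g) \;\geq\; \Delta_{BR},$$
where $\varepsilon_{D_a}$ denotes the 0-1 error of $h \circ g$ on group $a$. Whenever the base rates differ ($\Delta_{BR} > 0$), the two group errors therefore cannot both be made arbitrarily small. Accuracy parity, by contrast, asks that $\varepsilon_{D_0}(h \circ g) \approx \varepsilon_{D_1}(h \circ g)$, which the positive result guarantees when the optimal decision functions of the two groups are close.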
