Is There a Trade-Off Between Fairness and Accuracy? A Perspective Using Mismatched Hypothesis Testing

A trade-off between accuracy and fairness is almost taken as a given in the existing literature on fairness in machine learning. Yet it is not preordained that accuracy should decrease with increased fairness. Novel to this work, we examine fair classification through the lens of mismatched hypothesis testing: finding a classifier that distinguishes between two ideal distributions when given two mismatched, biased distributions instead. Using Chernoff information, a tool from information theory, we theoretically demonstrate that, contrary to popular belief, there always exist ideal distributions such that optimal fairness and accuracy (with respect to the ideal distributions) are achieved simultaneously: there is no trade-off. Moreover, the same classifier that exhibits no trade-off with respect to the ideal distributions exhibits a trade-off when accuracy is measured with respect to the given (possibly biased) dataset. To complement our main result, we formulate an optimization problem for finding ideal distributions and derive fundamental limits that explain why a trade-off exists on the given biased dataset. We also derive conditions under which active data collection can alleviate the fairness-accuracy trade-off in the real world. These results lead us to contend that it is problematic to measure accuracy with respect to data that reflects bias; accuracy should instead be measured with respect to ideal, unbiased data.
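For readers unfamiliar with the tool: the Chernoff information between two distributions P and Q is C(P, Q) = -min_{0 <= lam <= 1} log sum_x P(x)^lam Q(x)^(1-lam), and it characterizes the best achievable exponent of the error probability in Bayesian binary hypothesis testing. The sketch below is a minimal illustration of computing this quantity for discrete distributions by grid search over lam; it is our own illustration of the standard definition, not code from the paper, and the function name and grid resolution are assumptions.

```python
import numpy as np

def chernoff_information(p, q, num_lambdas=1000):
    """Approximate the Chernoff information C(P, Q) between two discrete
    distributions on the same finite alphabet:
        C(P, Q) = -min_{0 <= lam <= 1} log sum_x P(x)^lam Q(x)^(1 - lam).
    The minimization over lam is done by a simple grid search."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    lambdas = np.linspace(0.0, 1.0, num_lambdas)
    # Log of the Chernoff coefficient for each candidate lam; it equals 0
    # at lam = 0 and lam = 1, and is negative in between whenever P != Q.
    log_coeffs = [np.log(np.sum(p**lam * q**(1.0 - lam))) for lam in lambdas]
    return -min(log_coeffs)

# Example: two distributions over a binary alphabet.
p = [0.9, 0.1]
q = [0.4, 0.6]
print(chernoff_information(p, q))
```

A larger value of C(P, Q) means the two hypotheses are easier to distinguish: the optimal classifier's error probability decays faster in the number of observations.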
