Multi-Differential Fairness Auditor for Black Box Classifiers

Machine learning algorithms are increasingly involved in sensitive decision-making processes with adverse implications for individuals. This paper presents mdfa, an approach that identifies the characteristics of the victims of a classifier's discrimination. We measure discrimination as a violation of multi-differential fairness, a guarantee that a black-box classifier's outcomes do not leak information about the sensitive attributes of any small group of individuals. We reduce the problem of identifying worst-case violations to matching distributions and predicting where sensitive attributes and classifier outcomes coincide. We apply mdfa to a recidivism risk assessment classifier and find that individuals identified as African-American with little criminal history are three times more likely to be rated at high risk of violent recidivism than similar individuals who are not identified as African-American.
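
The sketch below illustrates the two-step reduction described in the abstract; it is a minimal, illustrative reading rather than the authors' implementation. It assumes a binary sensitive attribute and binary black-box outcomes, and the variable names (X, s, y_hat), the propensity-score weighting, and the choice of a shallow decision tree as the auditor are all assumptions made for the example.

```python
# Hedged sketch of an mdfa-style audit: (1) match the feature distributions of
# the two sensitive groups via propensity-score weights, then (2) fit a simple
# learner to predict where the sensitive attribute and the black-box outcome
# coincide. Not the authors' code; models and names are illustrative.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

def audit(X, s, y_hat, max_depth=3):
    """X: (n, d) features; s: (n,) binary sensitive attribute;
    y_hat: (n,) binary outcomes of the audited black-box classifier."""
    # Step 1: distribution matching. Inverse-propensity weights make the two
    # sensitive groups comparable on the observed features X.
    propensity = LogisticRegression(max_iter=1000).fit(X, s).predict_proba(X)[:, 1]
    propensity = np.clip(propensity, 1e-3, 1 - 1e-3)
    weights = np.where(s == 1, 1.0 / propensity, 1.0 / (1.0 - propensity))

    # Step 2: predict where the sensitive attribute and the outcome coincide.
    # A shallow tree keeps any flagged subgroup easy to describe.
    agree = (s == y_hat).astype(int)
    auditor = DecisionTreeClassifier(max_depth=max_depth)
    auditor.fit(X, agree, sample_weight=weights)

    # Weighted accuracy well above 1/2 indicates a region where outcomes leak
    # the sensitive attribute, i.e., a candidate fairness violation.
    acc = np.average(auditor.predict(X) == agree, weights=weights)
    return auditor, acc
```

In use, the leaves of the fitted auditor with high agreement rates describe candidate subgroups (e.g., feature ranges such as "little criminal history") whose members' outcomes track the sensitive attribute most strongly, which is the kind of worst-case characterization the audit is meant to surface.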
