Augmented Fairness: An Interpretable Model Augmenting Decision-Makers' Fairness

We propose a model-agnostic approach for mitigating the prediction bias of a black-box decision-maker, in particular a human decision-maker. Our method identifies the regions of the feature space where the black-box decision-maker is biased and replaces its decisions there with a few short decision rules, which act as a "fair surrogate". The rule-based surrogate is trained for two objectives: predictive performance and fairness. Our model targets a setting that is common in practice but distinct from most of the fairness literature: we have only black-box access to the decision-maker, and true labels can be queried only for a limited set of instances under a budget constraint. We formulate building the surrogate as a multi-objective optimization problem that maximizes predictive performance while minimizing bias. To solve it, we propose a novel training algorithm that combines a nondominated sorting genetic algorithm with active learning. We evaluate our approach on public datasets, simulating various biased "black-box" classifiers (decision-makers) and applying our method for interpretable augmented fairness.
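To make the multi-objective formulation concrete, below is a minimal Python sketch of the selection core of such a procedure: a two-objective fitness (prediction error and a demographic-parity gap as one possible bias measure) and a nondominated-front filter in the style of NSGA-II. The function names, the choice of fairness metric, and the candidate representation are illustrative assumptions, not the authors' implementation, and the budget-constrained active-learning query loop is omitted.

```python
# Illustrative sketch of NSGA-II-style selection over surrogate candidates.
# All names are hypothetical; this is not the paper's implementation.
import numpy as np

def demographic_parity_gap(y_pred, group):
    """|P(yhat=1 | group=0) - P(yhat=1 | group=1)|, one simple bias measure."""
    p0 = y_pred[group == 0].mean()
    p1 = y_pred[group == 1].mean()
    return abs(p0 - p1)

def fitness(y_true, y_pred, group):
    """Two objectives to minimize: prediction error and the fairness gap."""
    error = float(np.mean(y_pred != y_true))
    bias = demographic_parity_gap(y_pred, group)
    return (error, bias)

def nondominated_front(objectives):
    """Indices of candidates not dominated by any other candidate
    (candidate j dominates i if j is no worse in every objective and
    strictly better in at least one)."""
    front = []
    for i, oi in enumerate(objectives):
        dominated = any(
            all(a <= b for a, b in zip(oj, oi)) and oj != oi
            for j, oj in enumerate(objectives) if j != i
        )
        if not dominated:
            front.append(i)
    return front

# Toy usage: keep the Pareto-optimal candidates from a random population.
rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, 100)
group = rng.integers(0, 2, 100)
population_preds = [rng.integers(0, 2, 100) for _ in range(20)]
objs = [fitness(y_true, p, group) for p in population_preds]
print("Pareto front:", nondominated_front(objs))
```

In a full training loop, each candidate would be a short rule set evaluated only on the instances whose true labels have been queried so far, with the active-learning step choosing which labels to acquire next under the budget.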
