Augmented Fairness: An Interpretable Model Augmenting Decision-Makers' Fairness

We propose a model-agnostic approach for mitigating the prediction bias of a black-box decision-maker, in particular a human decision-maker. Our method identifies the regions of the feature space where the black-box decision-maker is biased and replaces its decisions there with a few short decision rules, which act as a "fair surrogate". The rule-based surrogate is trained for two objectives: predictive performance and fairness. Our model targets a setting that is common in practice but distinct from most of the fairness literature: we have only black-box access to the decision-maker, and true labels can be queried only for a limited set of instances under a budget constraint. We formulate building the surrogate as a multi-objective optimization problem that maximizes predictive performance while minimizing bias. To solve it, we propose a novel training algorithm that combines a nondominated sorting genetic algorithm with active learning. We evaluate our approach on public datasets, simulating various biased "black-box" classifiers (decision-makers) and applying our method for interpretable augmented fairness.
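To make the multi-objective formulation concrete, below is a minimal Python sketch of the selection core of such a procedure: a two-objective fitness (prediction error and a demographic-parity gap as one possible bias measure) and a nondominated-front filter in the style of NSGA-II. The function names, the choice of fairness metric, and the candidate representation are illustrative assumptions, not the authors' implementation, and the budget-constrained active-learning query loop is omitted.

```python
# Illustrative sketch of NSGA-II-style selection over surrogate candidates.
# All names are hypothetical; this is not the paper's implementation.
import numpy as np

def demographic_parity_gap(y_pred, group):
    """|P(yhat=1 | group=0) - P(yhat=1 | group=1)|, one simple bias measure."""
    p0 = y_pred[group == 0].mean()
    p1 = y_pred[group == 1].mean()
    return abs(p0 - p1)

def fitness(y_true, y_pred, group):
    """Two objectives to minimize: prediction error and the fairness gap."""
    error = float(np.mean(y_pred != y_true))
    bias = demographic_parity_gap(y_pred, group)
    return (error, bias)

def nondominated_front(objectives):
    """Indices of candidates not dominated by any other candidate
    (candidate j dominates i if j is no worse in every objective and
    strictly better in at least one)."""
    front = []
    for i, oi in enumerate(objectives):
        dominated = any(
            all(a <= b for a, b in zip(oj, oi)) and oj != oi
            for j, oj in enumerate(objectives) if j != i
        )
        if not dominated:
            front.append(i)
    return front

# Toy usage: keep the Pareto-optimal candidates from a random population.
rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, 100)
group = rng.integers(0, 2, 100)
population_preds = [rng.integers(0, 2, 100) for _ in range(20)]
objs = [fitness(y_true, p, group) for p in population_preds]
print("Pareto front:", nondominated_front(objs))
```

In a full training loop, each candidate would be a short rule set evaluated only on the instances whose true labels have been queried so far, with the active-learning step choosing which labels to acquire next under the budget.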
