Post-Hoc Methods for Debiasing Neural Networks

As deep learning models are tasked with more and more decisions that impact human lives, such as hiring, criminal recidivism prediction, and loan repayment, bias is a growing concern. This has led to dozens of definitions of fairness and numerous algorithmic techniques to improve the fairness of neural networks. Most debiasing algorithms require retraining a neural network from scratch. However, this is not feasible in many applications, especially when the model takes days to train or when the full training dataset is no longer available. In this work, we present a study of post-hoc methods for debiasing neural networks. First, we study the nature of the problem, showing that the difficulty of post-hoc debiasing is highly dependent on the initial conditions of the original model. Then we define three new fine-tuning techniques: random perturbation, layer-wise optimization, and adversarial fine-tuning. All three techniques work for any group fairness constraint. We compare six algorithms (three popular post-processing debiasing algorithms and our three proposed methods) across three datasets and three popular bias measures. We show that no post-hoc debiasing technique dominates all others, and we identify the settings in which each algorithm performs best. Our code is available at this https URL.
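To make the idea of post-hoc fine-tuning under a group fairness measure concrete, the following is a minimal sketch of a random-perturbation search. The abstract does not give the details of the method, so everything here is an assumption made for illustration: the toy one-layer model, the multiplicative Gaussian noise, the weighting `lam` between bias and error, and all function names are hypothetical. Statistical parity difference is used only as one example of a group bias measure.

```python
import math
import random


def statistical_parity_difference(y_pred, protected):
    """P(pred = 1 | unprotected) - P(pred = 1 | protected); 0 means parity."""
    unprot = [p for p, a in zip(y_pred, protected) if not a]
    prot = [p for p, a in zip(y_pred, protected) if a]
    return sum(unprot) / len(unprot) - sum(prot) / len(prot)


def predict(weights, X, threshold=0.5):
    """Toy stand-in for a trained network: thresholded sigmoid of a dot product."""
    preds = []
    for row in X:
        z = sum(w * x for w, x in zip(weights, row))
        z = max(-50.0, min(50.0, z))  # clamp to keep math.exp from overflowing
        score = 1.0 / (1.0 + math.exp(-z))
        preds.append(1 if score > threshold else 0)
    return preds


def objective(weights, X, y, protected, lam=0.75):
    """Assumed trade-off: lam * |bias| + (1 - lam) * error rate (lower is better)."""
    y_pred = predict(weights, X)
    bias = abs(statistical_parity_difference(y_pred, protected))
    error = sum(p != t for p, t in zip(y_pred, y)) / len(y)
    return lam * bias + (1 - lam) * error


def random_perturbation(weights, X_val, y_val, protected,
                        n_trials=200, scale=0.1, seed=0):
    """Try noisy copies of the original weights; keep the best on validation data."""
    rng = random.Random(seed)
    best_w = list(weights)
    best_obj = objective(best_w, X_val, y_val, protected)
    for _ in range(n_trials):
        # Multiply each original weight by noise drawn from N(1, scale^2).
        candidate = [w * rng.gauss(1.0, scale) for w in weights]
        obj = objective(candidate, X_val, y_val, protected)
        if obj < best_obj:
            best_w, best_obj = candidate, obj
    return best_w, best_obj
```

Because the search only evaluates the model on a held-out validation set, it needs neither gradients nor the original training data, which is the setting the abstract motivates. Any other group fairness measure could be substituted for `statistical_parity_difference` without changing the search loop.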
