Using Copies to Remove Sensitive Data: A Case Study on Fair Superhero Alignment Prediction

Ensuring that classification models are fair with respect to sensitive data attributes is a crucial task when applying machine learning to real-world problems, particularly in company production environments, where the decisions output by models may have a direct impact on individuals and where predictive performance must be maintained over time. In this article, building upon [17], we propose copies as a technique to mitigate the bias of trained algorithms in circumstances where the original data are not accessible and/or the models cannot be re-trained. In particular, we explore a simple methodology to build copies that replicate the learned decision behavior in the absence of sensitive attributes. We validate this methodology on the low-stakes problem of superhero alignment prediction. We demonstrate that this naive approach to bias reduction is feasible for this problem, and we argue that copies can be further exploited to endow models with desiderata such as fair learning.
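To make the copying step concrete, the sketch below illustrates one way to build a copy without sensitive attributes, assuming the original classifier is reachable only through its prediction interface and that synthetic query points can be sampled from the input domain. All concrete choices here (the scikit-learn estimators, the Gaussian sampling scheme, the SENSITIVE column index) are illustrative assumptions, not the exact pipeline of [10] or [17].

    # Minimal sketch: copying a trained classifier without its sensitive attribute.
    # Assumptions: the original model is a black box exposing only predict(),
    # synthetic queries can be drawn from the input domain, and the column
    # index SENSITIVE marks the attribute to be removed from the copy.
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)

    # 1. Stand-in for the trained "original" model (a black box in practice).
    X, y = make_classification(n_samples=1000, n_features=6, random_state=0)
    original = RandomForestClassifier(random_state=0).fit(X, y)

    SENSITIVE = 0  # index of the sensitive column to drop from the copy

    # 2. Draw synthetic queries covering the input domain and label them with
    #    the original model's predictions (no access to the training data).
    X_synth = rng.normal(size=(5000, X.shape[1]))
    y_synth = original.predict(X_synth)

    # 3. Train the copy on those labels but WITHOUT the sensitive column, so
    #    its decision function cannot depend on that attribute directly.
    X_copy = np.delete(X_synth, SENSITIVE, axis=1)
    copy_model = LogisticRegression(max_iter=1000).fit(X_copy, y_synth)

    # 4. Deploy: the copy scores new points from non-sensitive features only.
    x_new = rng.normal(size=(1, X.shape[1]))
    print(copy_model.predict(np.delete(x_new, SENSITIVE, axis=1)))

Because the copy never observes the sensitive column, its decision function cannot use that attribute directly; proxy features among the remaining columns may still encode it, which is one reason the approach above is only a naive first step toward fair learning.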

[1] Christopher T. Lowenkamp et al. False Positives, False Negatives, and False Analyses: A Rejoinder to "Machine Bias: There's Software Used across the Country to Predict Future Criminals. And It's Biased against Blacks", 2016.

[2] S. Fullerton et al. Genomics is failing on diversity, Nature, 2016.

[3] Arvind Narayanan et al. Semantics derived automatically from language corpora contain human-like biases, Science, 2016.

[4] K. Crawford. The Hidden Biases in Big Data, 2013.

[5] Nathan Srebro et al. Equality of Opportunity in Supervised Learning, NIPS, 2016.

[6] Carlos Eduardo Scheidegger et al. Certifying and Removing Disparate Impact, KDD, 2014.

[7] Adam Tauman Kalai et al. Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings, NIPS, 2016.

[8] Andrew D. Selbst et al. Big Data's Disparate Impact, 2016.

[9] Saikat Guha et al. Challenges in measuring online advertising systems, IMC '10, 2010.

[10] Jordi Nin et al. Copying Machine Learning Classifiers, IEEE Access, 2019.

[11] Anil K. Jain et al. Face Recognition Performance: Role of Demographic Information, IEEE Transactions on Information Forensics and Security, 2012.

[12] Toniann Pitassi et al. Fairness through awareness, ITCS '12, 2011.

[13] Jordi Nin et al. Towards Global Explanations for Credit Risk Scoring, arXiv, 2018.

[14] Timnit Gebru et al. Gender Shades: Intersectional Accuracy Disparities in Commercial Gender Classification, FAT*, 2018.