Characterizing Bias in Classifiers using Generative Models

Models learned from real-world data are often biased because their training data is biased. This can propagate systemic human biases and ultimately lead to inequitable treatment of people, especially minorities. To characterize bias in learned classifiers, existing approaches rely on human oracles who label real-world examples to identify the classifiers' "blind spots"; these approaches are limited by the human labor required and by the finite supply of existing image examples. We propose a simulation-based approach that uses generative adversarial models to interrogate classifiers systematically. We combine a progressive conditional generative model, which synthesizes photo-realistic facial images, with Bayesian Optimization to interrogate independent facial image classification systems efficiently. We show how this approach can be used to efficiently characterize racial and gender biases in commercial systems.
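
The interrogation loop can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the kernel, the acquisition function, or the parameterization of the generator, so the RBF kernel, the upper-confidence-bound rule, the two-dimensional conditioning vector `theta`, and the function `classifier_error` are all assumptions made for illustration. In particular, `classifier_error` is a hypothetical stand-in for the expensive step of generating conditioned faces and querying a classifier; here it is a synthetic function with a known peak.

```python
import numpy as np

def classifier_error(theta):
    """Hypothetical stand-in: in a real setup this would synthesize faces
    conditioned on theta, query the classifier, and return its error rate.
    Here it is a synthetic bump peaking at theta = (0.8, 0.2)."""
    peak = np.array([0.8, 0.2])
    return float(np.exp(-np.sum((theta - peak) ** 2) / 0.05))

def rbf(a, b, length=0.2):
    """RBF kernel matrix between two sets of points (assumed kernel)."""
    d2 = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
    return np.exp(-0.5 * d2 / length ** 2)

def gp_posterior(x_obs, y_obs, x_new, noise=1e-4):
    """Zero-mean Gaussian process posterior mean and std at x_new."""
    k = rbf(x_obs, x_obs) + noise * np.eye(len(x_obs))
    k_s = rbf(x_obs, x_new)
    solved = np.linalg.solve(k, k_s)          # K^{-1} K_s
    mean = solved.T @ y_obs
    var = 1.0 - np.sum(k_s * solved, axis=0)  # prior variance is 1
    return mean, np.sqrt(np.clip(var, 0.0, None))

def interrogate(n_init=5, n_iter=25, kappa=2.0, seed=0):
    """Search the conditioning space for the highest classifier error."""
    rng = np.random.default_rng(seed)
    grid = np.linspace(0.0, 1.0, 21)
    candidates = np.array([[a, b] for a in grid for b in grid])
    # Start from a few random conditioning vectors.
    x_obs = rng.random((n_init, 2))
    y_obs = np.array([classifier_error(x) for x in x_obs])
    for _ in range(n_iter):
        mean, std = gp_posterior(x_obs, y_obs, candidates)
        # Upper confidence bound: favor high predicted error or high uncertainty.
        nxt = candidates[np.argmax(mean + kappa * std)]
        x_obs = np.vstack([x_obs, nxt])
        y_obs = np.append(y_obs, classifier_error(nxt))
    best = int(np.argmax(y_obs))
    return x_obs[best], float(y_obs[best])

if __name__ == "__main__":
    theta, err = interrogate()
    print("highest-error conditioning found:", theta, "error:", err)
```

The point of the surrogate model is that each call to the real classifier is costly, so the loop spends its query budget on regions of the conditioning space that are either predicted to be error-prone or still unexplored, rather than sampling uniformly.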
