Paper information - Detecting Bias with Generative Counterfactual Face Attribute Augmentation

We introduce a simple framework for identifying biases of a smiling attribute classifier. Our method poses counterfactual questions of the form: how would the prediction change if this face characteristic had been different? We leverage recent advances in generative adversarial networks to build a realistic generative model of face images that affords controlled manipulation of specific image characteristics. We introduce a set of metrics that measure the effect of manipulating a specific property of an image on the output of a trained classifier. Empirically, we identify several different factors of variation that affect the predictions of a smiling classifier trained on CelebA.
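The counterfactual question in the abstract can be illustrated with a small sketch. The abstract does not spell out the paper's metrics, so the two quantities computed here (mean score change and decision flip rate) are illustrative assumptions, not the authors' definitions. The generator and classifier are likewise collapsed into a hypothetical toy logistic scorer on latent codes, and the names `classifier`, `counterfactual`, and `sensitivity_metrics` are invented for this example.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def classifier(z, weights):
    """Toy stand-in for a trained smiling classifier.

    Assumption: in a real pipeline this would score a generated image G(z);
    here it is a logistic score computed directly on the latent code z.
    """
    return sigmoid(sum(w * v for w, v in zip(weights, z)))


def counterfactual(z, direction, alpha):
    """Shift a latent code along a hypothetical attribute direction."""
    return [v + alpha * d for v, d in zip(z, direction)]


def sensitivity_metrics(latents, direction, alpha, weights, threshold=0.5):
    """Compare classifier outputs on original and counterfactual codes.

    Returns the mean change in score and the fraction of examples whose
    thresholded decision flips. Both are illustrative metrics only.
    """
    deltas = []
    flips = 0
    for z in latents:
        p_orig = classifier(z, weights)
        p_cf = classifier(counterfactual(z, direction, alpha), weights)
        deltas.append(p_cf - p_orig)
        if (p_orig >= threshold) != (p_cf >= threshold):
            flips += 1
    n = len(latents)
    return {"mean_score_change": sum(deltas) / n, "flip_rate": flips / n}


if __name__ == "__main__":
    latents = [[-1.0, 0.0], [0.5, 0.0], [-3.0, 0.0]]
    weights = [1.0, 0.0]  # the toy classifier only reads the first coordinate
    # A direction the classifier depends on changes its predictions.
    print(sensitivity_metrics(latents, [1.0, 0.0], 2.0, weights))
    # A direction it ignores leaves them unchanged.
    print(sensitivity_metrics(latents, [0.0, 1.0], 2.0, weights))
```

In this toy setting, a large mean score change or flip rate along an attribute direction would suggest that the classifier's smiling prediction depends on that attribute, which is the kind of dependence the framework is meant to surface.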
