Adversarial Removal of Gender from Deep Image Representations

In this work we analyze visual recognition tasks such as object and action recognition, and demonstrate the extent to which these tasks are correlated with features corresponding to a protected variable such as gender. We introduce the concept of natural leakage to measure the intrinsic reliance of a task on a protected variable, and show that visual recognition models trained for these tasks tend to exacerbate their reliance on gender features. To address this, we use adversarial training to remove unwanted features corresponding to protected variables from intermediate representations in a deep neural network. Experiments on two datasets, COCO (objects) and imSitu (actions), show that our models rely substantially less on gender features while retaining most of the accuracy of the original models. Our approach even surpasses a strong baseline that blurs or removes people from images using ground-truth annotations. Moreover, using an autoencoder-augmented model, we provide interpretable visual evidence that this approach performs semantically meaningful removal of gender features, and can therefore also be used to remove gender attributes directly from images.
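To make the core mechanism concrete, below is a minimal sketch of adversarial removal of a protected attribute from an intermediate representation, written in PyTorch and using a gradient-reversal formulation as one common way to realize the adversarial objective. All names (encoder, task_head, adversary) and hyperparameters are illustrative assumptions, not the paper's actual architecture or training setup.

```python
# Minimal sketch: adversarial removal of a protected attribute (e.g., gender)
# from an intermediate representation. Names and sizes are illustrative.
import torch
import torch.nn as nn

class GradientReversal(torch.autograd.Function):
    """Identity on the forward pass; flips (and scales) the gradient on the backward pass."""
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None

# Toy stand-ins for a real image backbone and task head.
encoder = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, 256), nn.ReLU())
task_head = nn.Linear(256, 80)   # e.g., multi-label object recognition
adversary = nn.Linear(256, 2)    # predicts the protected variable from the representation

params = list(encoder.parameters()) + list(task_head.parameters()) + list(adversary.parameters())
optimizer = torch.optim.Adam(params, lr=1e-4)
task_loss_fn = nn.BCEWithLogitsLoss()  # multi-label task loss
adv_loss_fn = nn.CrossEntropyLoss()    # protected-attribute classification loss

def training_step(images, task_labels, gender_labels, lam=1.0):
    features = encoder(images)
    task_logits = task_head(features)
    # The adversary sees the features through a gradient-reversal layer: the
    # adversary learns to predict gender, while the reversed gradient pushes
    # the encoder to remove gender information from the representation.
    adv_logits = adversary(GradientReversal.apply(features, lam))
    loss = task_loss_fn(task_logits, task_labels) + adv_loss_fn(adv_logits, gender_labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

In practice the trade-off between task accuracy and attribute removal is controlled by the reversal strength (lam here); alternating optimization of the adversary and the main model is a common alternative to gradient reversal.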
