Towards Reducing Bias in Gender Classification

Societal bias towards certain communities is a big problem that affects a lot of machine learning systems. This work aims at addressing the racial bias present in many modern gender recognition systems. We learn race invariant representations of human faces with an adversarially trained autoencoder model. We show that such representations help us achieve less biased performance in gender classification. We use variance in classification accuracy across different races as a surrogate for the racial bias of the model and achieve a drop of over 40% in variance with race invariant representations.