FairFace: Face Attribute Dataset for Balanced Race, Gender, and Age

Existing public face datasets are strongly biased toward Caucasian faces, and other races (e.g., Latino) are significantly underrepresented. This can lead to inconsistent model accuracy, limit the applicability of face analytic systems to non-White race groups, and adversely affect research findings based on such skewed data. To mitigate the race bias in these datasets, we construct a novel face image dataset containing 108,501 images, with an emphasis on balanced race composition. We define 7 race groups: White, Black, Indian, East Asian, Southeast Asian, Middle Eastern, and Latino. Images were collected from the YFCC-100M Flickr dataset and labeled with race, gender, and age groups. Evaluations were performed on existing face attribute datasets as well as novel image datasets to measure generalization performance. We find that the model trained on our dataset is substantially more accurate on novel datasets and that its accuracy is consistent across race and gender groups.
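As a rough illustration of what "consistent accuracy across groups" can mean in practice, the sketch below computes per-group accuracy and the gap between the best and worst group. It is not the evaluation code used for the dataset: the function names `per_group_accuracy` and `accuracy_gap`, the max-minus-min gap as a consistency measure, and the toy labels are all assumptions made for illustration.

```python
from collections import defaultdict


def per_group_accuracy(y_true, y_pred, groups):
    """Return {group: accuracy} for predictions split by a group label."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        total[group] += 1
        correct[group] += int(truth == pred)
    return {group: correct[group] / total[group] for group in total}


def accuracy_gap(acc_by_group):
    """Difference between the best and worst group accuracy (0 means equal)."""
    values = list(acc_by_group.values())
    return max(values) - min(values)


# Toy, made-up gender predictions tagged with a race group (not real results).
y_true = ["F", "M", "F", "M", "F", "M"]
y_pred = ["F", "M", "M", "M", "F", "M"]
groups = ["White", "White", "Black", "Black", "Latino", "Latino"]

acc = per_group_accuracy(y_true, y_pred, groups)
gap = accuracy_gap(acc)
print(acc, gap)
```

A smaller gap indicates more even performance across groups. Other summaries, such as the standard deviation of per-group accuracy, could be substituted in `accuracy_gap`.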
