A Step Toward More Inclusive People Annotations for Fairness

The Open Images Dataset contains approximately 9 million images and is widely used for computer vision research. As is common practice for large datasets, its annotations are not exhaustive: bounding boxes and attribute labels cover only a subset of the classes in each image. In this paper, we present the MIAP (More Inclusive Annotations for People) subset, a new set of annotations on a subset of the Open Images dataset that provides bounding boxes and attributes for all of the people visible in those images. The attributes and labeling methodology for the MIAP subset were designed to enable research into model fairness. In addition, we analyze the original annotation methodology for the person class and its subclasses, discussing the resulting patterns in order to inform future annotation efforts. By considering both the original and exhaustive annotation sets, researchers can now also study how systematic patterns in training annotations affect modeling.
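Because annotations of this kind are typically released as per-box records pairing bounding boxes with perceived-attribute labels, a minimal loading sketch in Python follows. The file name and column names used here (ImageID, XMin/YMin/XMax/YMax, AgePresentation, GenderPresentation) are assumptions made for illustration, not a confirmed schema, and should be checked against the released files.

    # Minimal sketch: load MIAP-style person boxes from a CSV.
    # NOTE: the file name and column names are assumptions for illustration.
    import csv
    from collections import Counter

    def load_person_boxes(csv_path):
        """Read person bounding boxes and presentation attributes from a CSV."""
        boxes = []
        with open(csv_path, newline="") as f:
            for row in csv.DictReader(f):
                boxes.append({
                    "image_id": row["ImageID"],
                    # Open Images box coordinates are normalized to [0, 1].
                    "box": (float(row["XMin"]), float(row["YMin"]),
                            float(row["XMax"]), float(row["YMax"])),
                    "age": row.get("AgePresentation", "Unknown"),
                    "gender": row.get("GenderPresentation", "Unknown"),
                })
        return boxes

    # Example: inspect the distribution of perceived-age labels per box.
    boxes = load_person_boxes("miap_boxes_train.csv")  # hypothetical path
    print(Counter(b["age"] for b in boxes))

A per-box view like this makes it straightforward to compare the exhaustive annotations against the original, sparser annotations for the same images.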
