A Survey on Bias in Visual Datasets

Computer Vision (CV) has achieved remarkable results, outperforming humans on several tasks. Nonetheless, it can lead to serious discrimination if not handled with care. CV systems depend heavily on the data they are trained on, and they can learn and amplify the biases present in that data. Understanding and discovering such biases are therefore problems of the utmost importance. Yet, to date, there is no comprehensive survey of bias in visual datasets. This work therefore aims to: i) describe the biases that can affect visual datasets; ii) review the literature on methods for bias discovery and quantification in visual datasets; and iii) discuss existing attempts to collect bias-aware visual datasets. A key conclusion of our study is that bias discovery and quantification in visual datasets remains an open problem, with room for improvement in both the methods available and the range of biases they can address. Moreover, since there is no such thing as a bias-free dataset, scientists and practitioners must become aware of the biases in their datasets and make them explicit. To this end, we propose a checklist for spotting different types of bias during visual dataset collection.

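To make "bias quantification" concrete, the following is a minimal illustrative sketch (ours, not a method from the surveyed literature) of one simple way to measure representation bias: the skew of a protected attribute's distribution in a dataset's metadata, computed as the KL divergence from the uniform distribution. The `labels` input and the toy `ages` data are hypothetical.

```python
# Minimal sketch (illustrative only): quantifying representational
# imbalance of a protected attribute in a visual dataset's metadata.
from collections import Counter
import math

def representation_skew(labels):
    """KL divergence of the empirical attribute distribution from uniform.

    Returns 0.0 for a perfectly balanced attribute; larger values mean
    stronger over-/under-representation of some groups.
    """
    counts = Counter(labels)
    n = len(labels)
    uniform = 1.0 / len(counts)  # ideal share per group
    return sum((c / n) * math.log((c / n) / uniform) for c in counts.values())

# Toy example: a face dataset whose age metadata skews toward one group.
ages = ["18-30"] * 700 + ["31-60"] * 250 + ["61+"] * 50
print(Counter(ages))
print(f"skew (nats): {representation_skew(ages):.3f}")  # > 0 => imbalanced
```

Such a single-attribute summary is deliberately crude: it captures only selection/representation imbalance and says nothing about, e.g., framing or label bias, which call for different measures.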