Large image datasets: A pyrrhic win for computer vision?

In this paper we investigate problematic practices and consequences of large scale vision datasets (LSVDs). We examine broad issues such as the question of consent and justice, as well as specific concerns such as the inclusion of verifiably pornographic images in datasets. Taking the ImageNet-ILSVRC-2012 dataset as an example, we perform a cross-sectional, model-based quantitative census covering factors such as age, gender, NSFW content scoring, class-wise accuracy, human-cardinality analysis, and the semanticity of the image class information, in order to statistically investigate the extent and subtleties of ethical transgressions. We then use the census to help hand-curate a look-up table of images in the ImageNet-ILSVRC-2012 dataset that fall into the categories of verifiably pornographic: shot in a non-consensual setting (up-skirt), beach voyeuristic, and exposed private parts. We survey the landscape of harms and threats that both society at large and individuals face due to uncritical and ill-considered dataset curation practices. We then propose possible courses of correction and critique their pros and cons. We have duly open-sourced all of the code and the census meta-datasets generated in this endeavor for the computer vision community to build on. By unveiling the severity of the threats, we hope to motivate the constitution of mandatory Institutional Review Boards (IRBs) for large scale dataset curation.
