Stable Bias: Analyzing Societal Representations in Diffusion Models

As machine learning-enabled Text-to-Image (TTI) systems become increasingly prevalent and see growing adoption as commercial services, characterizing the social biases they exhibit is a necessary first step toward lowering their risk of discriminatory outcomes. This evaluation is made more difficult by the synthetic nature of these systems' outputs: artificial depictions of fictive humans have no inherent gender or ethnicity, nor do they belong to socially constructed groups, so we need to look beyond common categorizations of diversity or representation. To address this need, we propose a new method for exploring and quantifying social biases in TTI systems by directly comparing collections of generated images designed to showcase a system's variation across social attributes (gender and ethnicity) and target attributes for bias evaluation (professions and gender-coded adjectives). Our approach allows us to (i) identify specific bias trends through visualization tools, (ii) provide targeted scores to directly compare models in terms of diversity and representation, and (iii) jointly model interdependent social variables to support a multidimensional analysis. We use this approach to analyze over 96,000 images generated by three popular TTI systems (DALL-E 2, Stable Diffusion v1.4, and Stable Diffusion v2) and find that all three significantly over-represent the portion of their latent space associated with whiteness and masculinity across target attributes; among the systems studied, DALL-E 2 shows the least diversity, followed by Stable Diffusion v2, then v1.4.
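
To make the idea of a targeted diversity score concrete, the following is a minimal sketch of one way such a score could be computed. It is not taken from the abstract: it assumes that each generated image has already been assigned to one of a fixed number of visual clusters, and it uses normalized Shannon entropy as a stand-in diversity measure. The function name `normalized_entropy`, the model names, and the cluster assignments are all hypothetical.

```python
import math
from collections import Counter


def normalized_entropy(cluster_ids, num_clusters):
    """Shannon entropy of cluster assignments, scaled to the range [0, 1].

    0 means every image fell into a single cluster; 1 means the images
    are spread evenly over all `num_clusters` clusters.
    """
    if num_clusters < 2 or not cluster_ids:
        return 0.0
    counts = Counter(cluster_ids)
    total = len(cluster_ids)
    entropy = -sum((c / total) * math.log(c / total) for c in counts.values())
    return entropy / math.log(num_clusters)


# Hypothetical cluster assignments for images generated from one
# profession prompt by two illustrative systems (assumed data).
assignments = {
    "model_a": [0, 0, 0, 0, 0, 0, 1, 0],
    "model_b": [0, 1, 2, 3, 0, 1, 2, 3],
}

scores = {
    name: normalized_entropy(ids, num_clusters=4)
    for name, ids in assignments.items()
}

for name, score in scores.items():
    print(f"{name}: {score:.3f}")
```

Under these assumptions, a system whose images concentrate in one cluster scores close to 0, while one that spreads evenly across clusters scores close to 1, which allows a direct comparison between systems for the same prompt.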
