GLAZE: Protecting Artists from Style Mimicry by Text-to-Image Models

Recent text-to-image diffusion models such as MidJourney and Stable Diffusion threaten to displace many in the professional artist community. In particular, models can learn to mimic the artistic style of specific artists after"fine-tuning"on samples of their art. In this paper, we describe the design, implementation and evaluation of Glaze, a tool that enables artists to apply"style cloaks"to their art before sharing online. These cloaks apply barely perceptible perturbations to images, and when used as training data, mislead generative models that try to mimic a specific artist. In coordination with the professional artist community, we deploy user studies to more than 1000 artists, assessing their views of AI art, as well as the efficacy of our tool, its usability and tolerability of perturbations, and robustness across different scenarios and against adaptive countermeasures. Both surveyed artists and empirical CLIP-based scores show that even at low perturbation levels (p=0.05), Glaze is highly successful at disrupting mimicry under normal conditions (>92%) and against adaptive countermeasures (>85%).

[1]  A. Madry,et al.  Raising the Cost of Malicious AI-Powered Image Editing , 2023, ICML.

[2]  T. Goldstein,et al.  Hard Prompts Made Easy: Gradient-Based Discrete Optimization for Prompt Tuning and Discovery , 2023, ArXiv.

[3]  Haoqi Fan,et al.  Scaling Language-Image Pre-Training via Masking , 2022, 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[4]  Bryan Catanzaro,et al.  eDiff-I: Text-to-Image Diffusion Models with an Ensemble of Expert Denoisers , 2022, ArXiv.

[5]  M. Irani,et al.  Imagic: Text-Based Real Image Editing with Diffusion Models , 2022, 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[6]  Ludwig Schmidt,et al.  LAION-5B: An open large-scale dataset for training next generation image-text models , 2022, NeurIPS.

[7]  Diederik P. Kingma,et al.  On Distillation of Guided Diffusion Models , 2022, 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[8]  William W. Cohen,et al.  Re-Imagen: Retrieval-Augmented Text-to-Image Generator , 2022, ICLR.

[9]  Yuanzhen Li,et al.  DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation , 2022, 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[10]  Amit H. Bermano,et al.  An Image is Worth One Word: Personalizing Text-to-Image Generation using Textual Inversion , 2022, ICLR.

[11]  David J. Fleet,et al.  Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding , 2022, NeurIPS.

[12]  Ben Y. Zhao,et al.  Post-breach Recovery: Protection against White-box Adversarial Examples for Leaked DNN Models , 2022, CCS.

[13]  Prafulla Dhariwal,et al.  Hierarchical Text-Conditional Image Generation with CLIP Latents , 2022, ArXiv.

[14]  Tero Karras,et al.  The Role of ImageNet Classes in Fréchet Inception Distance , 2022, ICLR.

[15]  B. Ommer,et al.  High-Resolution Image Synthesis with Latent Diffusion Models , 2021, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[16]  David J. Fleet,et al.  Palette: Image-to-Image Diffusion Models , 2021, SIGGRAPH.

[17]  Florian Tramèr,et al.  Data Poisoning Won't Save You From Facial Recognition , 2021, ICLR.

[18]  Chang Zhou,et al.  CogView: Mastering Text-to-Image Generation via Transformers , 2021, NeurIPS.

[19]  Ilya Sutskever,et al.  Learning Transferable Visual Models From Natural Language Supervision , 2021, ICML.

[20]  Alec Radford,et al.  Zero-Shot Text-to-Image Generation , 2021, ICML.

[21]  Radu Soricut,et al.  Conceptual 12M: Pushing Web-Scale Image-Text Pre-Training To Recognize Long-Tail Visual Concepts , 2021, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[22]  Quoc V. Le,et al.  Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision , 2021, ICML.

[23]  Micah Goldblum,et al.  LowKey: Leveraging Adversarial Attacks to Protect Social Media Users from Facial Recognition , 2021, ICLR.

[24]  Pascal Sturmfels,et al.  FoggySight: A Scheme for Facial Lookup Privacy , 2020, Proc. Priv. Enhancing Technol..

[25]  Feng Wang,et al.  Understanding the Behaviour of Contrastive Loss , 2020, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[26]  Ismail Ben Ayed,et al.  Augmented Lagrangian Adversarial Attacks , 2020, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).

[27]  Xiaoyuan Jing,et al.  DF-GAN: A Simple and Effective Baseline for Text-to-Image Synthesis , 2020, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[28]  Sahil Singla,et al.  Perceptual Adversarial Robustness: Defense Against Unseen Threat Models , 2020, ICLR.

[29]  Tero Karras,et al.  Training Generative Adversarial Networks with Limited Data , 2020, NeurIPS.

[30]  Somesh Jha,et al.  Face-Off: Adversarial Face Obfuscation , 2020, Proc. Priv. Enhancing Technol..

[31]  Hamid R. Rabiee,et al.  ARAE: Adversarially Robust Training of Autoencoders Improves Novelty Detection , 2020, Neural Networks.

[32]  Ben Y. Zhao,et al.  Fawkes: Protecting Privacy against Unauthorized Deep Learning Models , 2020, USENIX Security Symposium.

[33]  Ben Y. Zhao,et al.  Gotta Catch'Em All: Using Honeypots to Catch Adversarial Attacks on Neural Networks , 2019, CCS.

[34]  Wei Chen,et al.  DM-GAN: Dynamic Memory Generative Adversarial Networks for Text-To-Image Synthesis , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[35]  Ben Y. Zhao,et al.  Neural Cleanse: Identifying and Mitigating Backdoor Attacks in Neural Networks , 2019, 2019 IEEE Symposium on Security and Privacy (SP).

[36]  Yannis Avrithis,et al.  Smooth adversarial examples , 2019, EURASIP J. Inf. Secur..

[37]  Lei Zhang,et al.  Object-Driven Text-To-Image Synthesis via Adversarial Training , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[38]  Fabio Roli,et al.  Why Do Adversarial Attacks Transfer? Explaining Transferability of Evasion and Poisoning Attacks , 2018, USENIX Security Symposium.

[39]  Radu Soricut,et al.  Conceptual Captions: A Cleaned, Hypernymed, Image Alt-text Dataset For Automatic Image Captioning , 2018, ACL.

[40]  Eduardo Valle,et al.  Adversarial Attacks on Variational Autoencoders , 2018, LatinX in AI at Neural Information Processing Systems Conference 2018.

[41]  Tudor Dumitras,et al.  When Does Machine Learning FAIL? Generalized Transferability for Evasion and Poisoning Attacks , 2018, USENIX Security Symposium.

[42]  Alexei A. Efros,et al.  The Unreasonable Effectiveness of Deep Features as a Perceptual Metric , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[43]  Zhe Gan,et al.  AttnGAN: Fine-Grained Text to Image Generation with Attentional Generative Adversarial Networks , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[44]  Y. Blau,et al.  The Perception-Distortion Tradeoff , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[45]  David Wagner,et al.  Adversarial Examples Are Not Easily Detected: Bypassing Ten Detection Methods , 2017, AISec@CCS.

[46]  Ryan R. Curtin,et al.  Detecting Adversarial Samples from Artifacts , 2017, ArXiv.

[47]  Dawn Xiaodong Song,et al.  Adversarial Examples for Generative Models , 2017, 2018 IEEE Security and Privacy Workshops (SPW).

[48]  Dimitris N. Metaxas,et al.  StackGAN: Text to Photo-Realistic Image Synthesis with Stacked Generative Adversarial Networks , 2016, 2017 IEEE International Conference on Computer Vision (ICCV).

[49]  Eduardo Valle,et al.  Adversarial Images for Variational Autoencoders , 2016, ArXiv.

[50]  Robert Pless,et al.  Deep Feature Interpolation for Image Content Changes , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[51]  Tom White,et al.  Sampling Generative Networks: Notes on a Few Effective Techniques , 2016, ArXiv.

[52]  David A. Wagner,et al.  Towards Evaluating the Robustness of Neural Networks , 2016, 2017 IEEE Symposium on Security and Privacy (SP).

[53]  Lei Zhang,et al.  Beyond a Gaussian Denoiser: Residual Learning of Deep CNN for Image Denoising , 2016, IEEE Transactions on Image Processing.

[54]  Samy Bengio,et al.  Adversarial examples in the physical world , 2016, ICLR.

[55]  Patrick D. McDaniel,et al.  Transferability in Machine Learning: from Phenomena to Black-Box Attacks using Adversarial Samples , 2016, ArXiv.

[56]  Soumith Chintala,et al.  Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks , 2015, ICLR.

[57]  David J. Fleet,et al.  Adversarial Manipulation of Deep Representations , 2015, ICLR.

[58]  Ruslan Salakhutdinov,et al.  Generating Images from Captions with Attention , 2015, ICLR.

[59]  Babak Saleh,et al.  Large-scale Classification of Fine-Art Paintings: Learning The Right Metric on The Right Feature , 2015, ArXiv.

[60]  David A. Shamma,et al.  YFCC100M , 2015, Commun. ACM.

[61]  Xiaogang Wang,et al.  Deep Learning Face Attributes in the Wild , 2014, 2015 IEEE International Conference on Computer Vision (ICCV).

[62]  Yoshua Bengio,et al.  How transferable are features in deep neural networks? , 2014, NIPS.

[63]  Fei-Fei Li,et al.  ImageNet: A large-scale hierarchical image database , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[64]  Rongsheng Zhang,et al.  VC-GPT: Visual Conditioned GPT for End-to-End Generative Vision-and-Language Pre-training , 2022, ArXiv.

[65]  A. Linear-probe,et al.  Learning Transferable Visual Models From Natural Language Supervision , 2021 .

[66]  Yan Ke,et al.  Efficient Near-duplicate Detection and Sub-image Retrieval , 2004 .

[67]  Wei Xu,et al.  Improving one-class SVM for anomaly detection , 2003, Proceedings of the 2003 International Conference on Machine Learning and Cybernetics (IEEE Cat. No.03EX693).