White-box Membership Inference Attacks against Diffusion Models

Diffusion models have begun to overshadow GANs and other generative models in industrial applications due to their superior image generation performance. The complex architecture of these models furnishes an extensive array of attack features. In light of this, we aim to design membership inference attacks (MIAs) tailored to diffusion models. We first conduct an exhaustive analysis of existing MIAs on diffusion models, considering factors such as black-box versus white-box access and the choice of attack features. We find that white-box attacks are highly applicable in real-world scenarios and that the most effective attacks to date are white-box. Departing from earlier research, which employs model loss as the attack feature for white-box MIAs, we employ model gradients, leveraging the fact that gradients capture a model's response to individual samples more thoroughly than the loss alone. We subject these attacks to rigorous testing across a range of parameters, including training steps, sampling frequency, diffusion steps, and data variance. Across all experimental settings, our method consistently achieves near-flawless performance, with an attack success rate approaching $100\%$ and an attack AUCROC near $1.0$. We also evaluate our attack against common defense mechanisms and observe that it continues to perform well.
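The abstract describes the attack only at a high level; the concrete gradient feature is not specified here. As a minimal sketch of the general idea, assuming white-box access to a PyTorch DDPM-style noise-prediction model, one can use the norm of the per-sample training-loss gradient as the membership signal. The names eps_model and alpha_bar, the model's call signature, and the aggregation into a single scalar are illustrative assumptions, not the paper's exact construction.

import torch

def gradient_feature(eps_model, x0, t, alpha_bar):
    # x0: batch of candidate samples, t: integer timesteps, alpha_bar: cumulative
    # noise schedule (hypothetical names; any DDPM implementation supplies them).
    eps_model.zero_grad()
    noise = torch.randn_like(x0)
    a = alpha_bar[t].view(-1, 1, 1, 1)
    x_t = a.sqrt() * x0 + (1.0 - a).sqrt() * noise                 # forward process q(x_t | x_0)
    loss = torch.nn.functional.mse_loss(eps_model(x_t, t), noise)  # standard DDPM training loss
    loss.backward()                                                # white-box access: full parameter gradients
    # Collapse per-parameter gradients into one scalar attack feature.
    sq_norm = sum(p.grad.pow(2).sum() for p in eps_model.parameters()
                  if p.grad is not None)
    return sq_norm.sqrt().item()

Because members were fit during training, their loss gradients tend to be smaller than those of non-members, so a simple threshold on this scalar (or a classifier over per-layer gradient norms) can serve as the membership decision rule in such a sketch.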
