Investigating Data Memorization in 3D Latent Diffusion Models for Medical Image Synthesis

Generative latent diffusion models have been established as state-of-the-art in data generation. One promising application is the generation of realistic synthetic medical imaging data for open data sharing without compromising patient privacy. Despite this promise, the capacity of such models to memorize sensitive patient training data and to synthesize samples that closely resemble training samples remains relatively unexplored. Here, we assess the memorization capacity of 3D latent diffusion models on photon-counting coronary computed tomography angiography and knee magnetic resonance imaging datasets. To detect potential memorization of training samples, we use self-supervised models based on contrastive learning. Our results suggest that such latent diffusion models do memorize training data, and that strategies to mitigate memorization are urgently needed.
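
As an illustration of how embedding-based copy detection can work in principle, the following minimal sketch compares each synthetic sample to its nearest training sample in an embedding space and flags pairs whose cosine similarity exceeds a threshold. It is not the pipeline used in this work: the embeddings are assumed to come from some contrastively trained encoder (replaced here by toy vectors), and the function names and the threshold value of 0.95 are hypothetical choices made only for the example.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def flag_potential_copies(train_emb, synth_emb, threshold):
    """For each synthetic embedding, find the most similar training embedding.

    Returns a list of (nearest_train_index, similarity, is_flagged) tuples.
    `threshold` is a hypothetical cut-off; in practice it would have to be
    calibrated, for example on held-out real samples.
    """
    results = []
    for s in synth_emb:
        sims = [cosine_similarity(s, t) for t in train_emb]
        best = max(range(len(sims)), key=sims.__getitem__)
        results.append((best, sims[best], sims[best] >= threshold))
    return results


# Toy embeddings standing in for encoder outputs (assumption: a real
# encoder would produce much higher-dimensional vectors).
train_emb = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
synth_emb = [
    [0.01, 0.99, 0.02],  # nearly identical to training sample 1
    [1.0, 1.0, 1.0],     # equally far from all training samples
]

results = flag_potential_copies(train_emb, synth_emb, threshold=0.95)
for i, (idx, sim, flagged) in enumerate(results):
    print(f"synthetic {i}: nearest train {idx}, similarity {sim:.3f}, flagged={flagged}")
```

In this toy setting the first synthetic vector is flagged as a potential copy of training sample 1, while the second is not flagged. With real 3D volumes, the quality of the encoder and the choice of threshold would determine how reliable such a flag is.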
