CamDiff: Camouflage Image Augmentation via Diffusion Model

Camouflaged object detection (COD) aims to identify objects that blend into their surroundings. Although recent models perform impressively, we identify a limitation in their robustness: existing methods may misclassify salient objects as camouflaged ones, even though the two characteristics are contradictory. This limitation may stem from a lack of multi-pattern training images, which leads to weaker robustness to saliency. To address this issue, we introduce CamDiff, a novel approach inspired by AI-Generated Content (AIGC) that overcomes the scarcity of multi-pattern training images. Specifically, we leverage a latent diffusion model to synthesize salient objects in camouflaged scenes, and we use the zero-shot image classification ability of the Contrastive Language-Image Pre-training (CLIP) model to prevent synthesis failures and ensure that the synthesized object aligns with the input prompt. Consequently, the synthesized image retains its original camouflage label while incorporating salient objects, yielding camouflage samples with richer characteristics. User studies show that the salient objects in scenes synthesized by our framework attract more of users' attention; such samples therefore pose a greater challenge to existing COD models. Our approach enables flexible editing and efficient large-scale dataset generation at low cost. It significantly enhances both the training and testing phases of COD baselines, emphasizing robustness across diverse domains. Our newly generated datasets and source code are available at https://github.com/drlxj/CamDiff.
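
The generate-then-verify idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`zero_shot_probs`, `synthesize_with_check`), the `threshold` and `max_attempts` parameters, and the `generate` and `embed_image` callables are all hypothetical placeholders that stand in for a latent diffusion model and a CLIP image encoder. The temperature of 100 is assumed here as a typical scale for CLIP-style similarity logits. The sketch only shows how a zero-shot classifier over prompt embeddings could reject a failed synthesis and trigger a retry.

```python
import math
from typing import Any, Callable, List, Optional, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    if na == 0.0 or nb == 0.0:
        return 0.0
    return dot / (na * nb)


def zero_shot_probs(
    image_emb: Sequence[float],
    text_embs: Sequence[Sequence[float]],
    temperature: float = 100.0,  # assumed scale, not taken from the paper
) -> List[float]:
    """Softmax over scaled image-text cosine similarities."""
    logits = [temperature * cosine(image_emb, t) for t in text_embs]
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def synthesize_with_check(
    generate: Callable[[int], Any],               # hypothetical: attempt index -> image
    embed_image: Callable[[Any], Sequence[float]],  # hypothetical: image -> embedding
    text_embs: Sequence[Sequence[float]],         # one embedding per candidate class prompt
    target_index: int,                            # index of the prompt the object should match
    threshold: float = 0.5,                       # hypothetical acceptance threshold
    max_attempts: int = 5,
) -> Optional[Any]:
    """Retry generation until the classifier agrees with the target prompt."""
    for attempt in range(max_attempts):
        candidate = generate(attempt)
        probs = zero_shot_probs(embed_image(candidate), text_embs)
        best = max(range(len(probs)), key=probs.__getitem__)
        if best == target_index and probs[target_index] >= threshold:
            return candidate
    return None  # every attempt was rejected as a synthesis failure
```

In a real pipeline, `generate` would wrap the diffusion model's inpainting call and `embed_image` would wrap the CLIP image encoder; here both are left abstract so that the accept/reject logic can be read on its own.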
