A Zero-/Few-Shot Anomaly Classification and Segmentation Method for CVPR 2023 VAND Workshop Challenge Tracks 1&2: 1st Place on Zero-shot AD and 4th Place on Few-shot AD

In this technical report, we briefly introduce our solution for the Zero/Few-shot Track of the Visual Anomaly and Novelty Detection (VAND) 2023 Challenge. For industrial visual inspection, building a single model that can be rapidly adapted to numerous categories with no or only a few normal reference images is a promising research direction, primarily because of the vast variety of product types. For the zero-shot track, we propose a solution based on the CLIP model with extra linear layers added. These layers map the image features to the joint embedding space so that they can be compared with the text features to generate the anomaly maps. When reference images are available, we additionally use multiple memory banks to store their features and compare them with the features of the test images during the testing phase. In this challenge, our method achieved first place in the zero-shot track, excelling in particular in segmentation, with an F1 score improvement of 0.0489 over the second-ranked participant. In the few-shot track, we placed fourth overall, with our classification F1 score of 0.8687 ranking first among all participating teams.
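
The two comparison steps described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the single projection matrix `proj_weight` standing in for the extra linear layers, the two-row text embedding (normal, abnormal), the softmax temperature, the use of one memory bank per feature level, and the averaging across banks are all assumptions made for illustration, and random arrays stand in for real CLIP features.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length so dot products become cosine similarities."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)


def zero_shot_anomaly_map(patch_feats, proj_weight, text_feats, grid_hw, temperature=0.01):
    """Zero-shot sketch: project patch features, then compare them with text features.

    patch_feats: (N, D_img) patch-level image features (N = H * W).
    proj_weight: (D_img, D_joint) hypothetical linear layer into the joint space.
    text_feats:  (2, D_joint) assumed rows: [normal prompt, abnormal prompt].
    """
    projected = l2_normalize(patch_feats @ proj_weight)
    text = l2_normalize(text_feats)
    logits = projected @ text.T / temperature          # (N, 2) scaled cosine similarities
    logits -= logits.max(axis=1, keepdims=True)        # stabilise the softmax
    probs = np.exp(logits)
    probs /= probs.sum(axis=1, keepdims=True)
    return probs[:, 1].reshape(grid_hw)                # per-patch "abnormal" probability


def few_shot_anomaly_map(test_feats_per_level, memory_banks, grid_hw):
    """Few-shot sketch: score each test patch by its distance to the nearest stored patch.

    test_feats_per_level: list of (N, D) arrays, one per feature level (assumed).
    memory_banks:         list of (M, D) arrays of reference-image patch features.
    """
    maps = []
    for test_feats, bank in zip(test_feats_per_level, memory_banks):
        sim = l2_normalize(test_feats) @ l2_normalize(bank).T   # (N, M)
        maps.append(1.0 - sim.max(axis=1))                      # cosine distance to nearest neighbour
    return np.mean(maps, axis=0).reshape(grid_hw)               # averaging is an assumption


# Toy data in place of real CLIP features.
rng = np.random.default_rng(0)
h, w, d_img, d_joint = 4, 4, 8, 6

patch_feats = rng.normal(size=(h * w, d_img))
proj_weight = rng.normal(size=(d_img, d_joint))
text_feats = rng.normal(size=(2, d_joint))
zs_map = zero_shot_anomaly_map(patch_feats, proj_weight, text_feats, (h, w))

memory_banks = [rng.normal(size=(32, d_img)) for _ in range(2)]
test_feats_per_level = []
for bank in memory_banks:
    feats = rng.normal(size=(h * w, d_img))
    feats[0] = bank[0]          # patch 0 matches a stored normal patch exactly
    test_feats_per_level.append(feats)
fs_map = few_shot_anomaly_map(test_feats_per_level, memory_banks, (h, w))

print(zs_map.shape, fs_map.shape)
```

In this toy setup the first test patch is copied from the memory banks, so its few-shot score is close to zero, while patches with no close match in the banks receive higher scores.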
