Paper information - Comparing CAM Algorithms for the Identification of Salient Image Features in Iconography Artwork Analysis

Iconography studies the visual content of artworks by considering the themes portrayed in them and how those themes are represented. Computer Vision has been used to identify iconographic subjects in paintings, and Convolutional Neural Networks (CNNs) have enabled the effective classification of characters in Christian art paintings. However, it remains to be demonstrated whether the classification results obtained by CNNs rely on the same iconographic properties that human experts exploit when studying iconography, and whether the architecture of a classifier trained on whole artwork images can be exploited to support the much harder task of object detection. A suitable approach for exposing the classification process of neural models relies on Class Activation Maps, which emphasize the areas of an image that contribute the most to the classification. This work compares state-of-the-art algorithms (CAM, Grad-CAM, Grad-CAM++, and Smooth Grad-CAM++) in terms of their capacity to identify the iconographic attributes that determine the classification of characters in Christian art paintings. Quantitative and qualitative analyses show that Grad-CAM, Grad-CAM++, and Smooth Grad-CAM++ perform similarly, while CAM is less effective. Smooth Grad-CAM++ isolates multiple disconnected image regions that identify small iconographic symbols well. Grad-CAM produces wider and more contiguous areas that cover large iconographic symbols better. The salient image areas computed by the CAM algorithms have been used to estimate object-level bounding boxes, and a quantitative analysis shows that the boxes estimated with Grad-CAM reach 55% average IoU, 61% GT-known localization, and 31% mAP. The obtained results are a step towards the computer-aided study of variations in the positioning and mutual relations of iconographic elements in artworks, and they open the way to the automatic creation of bounding boxes for training detectors of iconographic symbols in Christian art images.
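The abstract mentions estimating bounding boxes from salient areas and scoring them by IoU, but does not spell out the procedure. The following is a minimal sketch of one common way to do this, not the authors' method: threshold the activation map relative to its peak, take the tightest box around the surviving cells, and compare it with a ground-truth box. The function names, the 0.5 relative threshold, and the toy activation map are all assumptions made for illustration.

```python
from typing import List, Optional, Tuple

# (x_min, y_min, x_max, y_max), inclusive cell indices
Box = Tuple[int, int, int, int]


def cam_to_box(cam: List[List[float]], rel_threshold: float = 0.5) -> Optional[Box]:
    """Tightest box around all cells with activation >= rel_threshold * peak.

    Returns None when the map has no positive activation.
    The relative threshold is an illustrative assumption.
    """
    peak = max(max(row) for row in cam)
    if peak <= 0:
        return None
    cutoff = rel_threshold * peak
    xs = [x for row in cam for x, v in enumerate(row) if v >= cutoff]
    ys = [y for y, row in enumerate(cam) for v in row if v >= cutoff]
    return (min(xs), min(ys), max(xs), max(ys))


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two inclusive boxes."""
    def area(box: Box) -> int:
        return (box[2] - box[0] + 1) * (box[3] - box[1] + 1)

    ix_min, iy_min = max(a[0], b[0]), max(a[1], b[1])
    ix_max, iy_max = min(a[2], b[2]), min(a[3], b[3])
    if ix_max < ix_min or iy_max < iy_min:
        inter = 0  # the boxes do not overlap
    else:
        inter = (ix_max - ix_min + 1) * (iy_max - iy_min + 1)
    return inter / (area(a) + area(b) - inter)


# Toy 4x4 activation map (hypothetical values) with one salient blob.
example_cam = [
    [0.0, 0.1, 0.0, 0.0],
    [0.1, 0.9, 0.8, 0.0],
    [0.0, 0.7, 1.0, 0.1],
    [0.0, 0.0, 0.2, 0.0],
]
predicted_box = cam_to_box(example_cam)   # box around the four strongest cells
ground_truth_box = (1, 1, 3, 2)           # hypothetical annotation
score = iou(predicted_box, ground_truth_box)
```

A single tight box is a simplification: a map with several disconnected regions, as the abstract describes for Smooth Grad-CAM++, would more plausibly yield one box per connected region. GT-known localization is commonly computed as the fraction of images whose predicted box reaches an IoU of at least 0.5 with a ground-truth box of the known class; whether the paper uses exactly that cutoff is not stated in the abstract.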
