Paper information - Cross-Depicted Historical Motif Categorization and Retrieval with Deep Learning

In this paper, we tackle the problem of categorizing and identifying cross-depicted historical motifs using recent deep learning techniques, with the aim of developing a content-based image retrieval system. By cross-depiction, we mean the problem that the same object can be represented (depicted) in various ways. The objects of interest in this research are watermarks, which are crucial for dating manuscripts. For watermarks, cross-depiction arises for two reasons: (i) there are many similar representations of the same motif, and (ii) there are several ways of capturing a watermark: because watermarks are not visible on a scan or photograph, they are typically reproduced via hand tracing, rubbing, or special photographic techniques. This leads to different representations of the same (or similar) objects, making it hard for pattern recognition methods to recognize the watermarks. While this is a simple problem for human experts, computer vision techniques struggle to generalize across the various depiction styles. In this paper, we present a study in which we use deep neural networks to categorize watermarks at varying levels of detail. The macro-averaged F1-score on an imbalanced 12-category classification task is 88.3%, and the multi-label performance (Jaccard index) on a 622-label task is 79.5%. To analyze the usefulness of an image-based system for assisting humanities scholars in cataloguing manuscripts, we also measure the performance of similarity matching on expert-crafted test sets of varying sizes (50 and 1000 watermark samples). A significant outcome is that our system finds all relevant results belonging to the same super-class (Mean Average Precision of 100%), despite the cross-depicted nature of the motifs. This result has not been achieved in the literature so far.
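
The three evaluation measures named in the abstract (macro-averaged F1, Jaccard index, and Mean Average Precision) can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names, the sample-averaged form of the Jaccard index, and the toy inputs are assumptions chosen for illustration, and the average precision below assumes each ranked list covers the whole collection so that every relevant item appears somewhere in it.

```python
from typing import Sequence, Set


def macro_f1(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Unweighted mean of per-class F1, so rare classes weigh as much as common ones."""
    classes = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def mean_jaccard(true_sets: Sequence[Set[int]], pred_sets: Sequence[Set[int]]) -> float:
    """Sample-averaged Jaccard index: |T & P| / |T | P| per sample, then the mean."""
    total = 0.0
    for t, p in zip(true_sets, pred_sets):
        union = t | p
        # Two empty label sets are treated as a perfect match (an assumption).
        total += len(t & p) / len(union) if union else 1.0
    return total / len(true_sets)


def average_precision(ranked_relevance: Sequence[bool]) -> float:
    """Mean of precision@k taken at each rank k where a relevant item appears."""
    hits = 0
    precisions = []
    for rank, relevant in enumerate(ranked_relevance, start=1):
        if relevant:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions) if precisions else 0.0


def mean_average_precision(queries: Sequence[Sequence[bool]]) -> float:
    """Average of the per-query average precision."""
    return sum(average_precision(q) for q in queries) / len(queries)


if __name__ == "__main__":
    # Toy data only; none of these numbers come from the paper.
    print(macro_f1([0, 0, 0, 1], [0, 0, 1, 1]))
    print(mean_jaccard([{1, 2}, {3}], [{1}, {3}]))
    print(mean_average_precision([[True, True, False], [True]]))
```

Under this definition, a Mean Average Precision of 100% means that, for every query, all relevant items are ranked ahead of every irrelevant one.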
