Exploring Confidence Measures for Word Spotting in Heterogeneous Datasets

In recent years, convolutional neural networks (CNNs) have come to dominate the field of document analysis and have become the predominant model for word spotting. Attribute CNNs in particular, which learn the mapping between a word image and an attribute representation, have shown exceptional performance. The drawback of this approach is the overconfidence of neural networks when they are applied outside their training distribution. In this paper, we explore different metrics for quantifying the confidence of a CNN in its predictions, specifically on the retrieval problem of word spotting. These confidence measures mitigate a limitation of retrieval lists, namely that they cannot reject candidates. We investigate four different approaches that are either based on the network's attribute estimations or make use of a surrogate model. Our approach also aims to identify the parts of a dataset for which the retrieval system gives reliable results. We further show that there is a direct relation between the proposed confidence measures and the quality of an estimated attribute representation.
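As a rough illustration of how a confidence measure derived from attribute estimations could let a retrieval list reject candidates, consider the sketch below. It is not one of the four measures investigated in the paper, which the abstract does not specify. It assumes the network outputs one sigmoid pseudo-probability per attribute and scores a word image by how far its estimates lie from 0.5 on average. The function names, the cosine ranking, and the threshold are all hypothetical choices made for this example.

```python
import numpy as np

def attribute_confidence(probs):
    """Hypothetical measure: mean per-attribute certainty.

    Returns 1.0 when every estimate is exactly 0 or 1 and 0.5 when
    every estimate is 0.5 (assumes sigmoid outputs in [0, 1]).
    """
    probs = np.asarray(probs, dtype=float)
    return np.maximum(probs, 1.0 - probs).mean(axis=-1)

def retrieve_with_rejection(query, candidates, threshold):
    """Rank candidates by cosine similarity to the query, rejecting
    candidates whose confidence falls below the (assumed) threshold.

    Returns the indices of the kept candidates, best match first.
    """
    query = np.asarray(query, dtype=float)
    candidates = np.asarray(candidates, dtype=float)
    conf = attribute_confidence(candidates)
    keep = np.flatnonzero(conf >= threshold)  # indices that survive rejection
    kept = candidates[keep]
    norms = np.linalg.norm(kept, axis=1) * np.linalg.norm(query) + 1e-12
    sims = kept @ query / norms
    return keep[np.argsort(-sims)]

# Toy attribute estimates for three word images and one query string.
candidates = [[0.9, 0.1, 0.95], [0.5, 0.5, 0.5], [0.1, 0.9, 0.05]]
query = [1.0, 0.0, 1.0]
ranking = retrieve_with_rejection(query, candidates, threshold=0.8)
```

In this toy setup the second candidate, whose estimates all sit at 0.5, is dropped from the retrieval list instead of being ranked; the choice of threshold would have to be tuned per dataset.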
