Neural Ctrl-F: Segmentation-Free Query-by-String Word Spotting in Handwritten Manuscript Collections

In this paper, we address the problem of segmentation-free query-by-string word spotting in handwritten documents. In other words, we use methods inspired by computer vision and machine learning to search for words in large collections of digitized manuscripts. In particular, we are interested in historical handwritten texts, which are often far more challenging than modern printed documents. This task is important because it gives people a way to quickly find what they are looking for in large collections that are tedious and difficult to read manually. To this end, we introduce an end-to-end trainable model based on deep neural networks that we call Ctrl-F-Net. Given a full manuscript page, the model simultaneously generates region proposals and embeds them into a distributed word embedding space, where searches are performed. We evaluate the model on common benchmarks for handwritten word spotting, outperforming the previous state-of-the-art segmentation-free approaches by a large margin and, in some cases, even segmentation-based approaches. One interesting real-life application of our approach is helping historians find and count specific words in court records that relate to women's sustenance activities and division of labor. We provide promising preliminary experiments that validate our method on this task.
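To make the search step concrete, the following is a minimal sketch of query-by-string retrieval in a shared embedding space. It is not the Ctrl-F-Net implementation: the function names, the bounding boxes, and the toy character-count embedding are all assumptions made for illustration, and the region embeddings that a trained network would predict from image content are simulated here by embedding known strings.

```python
import math
from collections import Counter

ALPHABET = "abcdefghijklmnopqrstuvwxyz"


def embed_string(word):
    """Toy stand-in embedding (an assumption): L2-normalized letter counts."""
    counts = Counter(ch for ch in word.lower() if ch in ALPHABET)
    vec = [float(counts[ch]) for ch in ALPHABET]
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else vec


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    if na == 0 or nb == 0:
        return 0.0
    return dot / (na * nb)


def search(query, regions, top_k=3):
    """Rank proposed regions by similarity to the embedded query string.

    regions: list of (box, embedding) pairs, where box is (x1, y1, x2, y2).
    Returns the top_k (score, box) pairs, best match first.
    """
    q = embed_string(query)
    scored = [(cosine(q, emb), box) for box, emb in regions]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Hypothetical page: in a real system the embeddings would be predicted
# from the image for each region proposal; here they are simulated.
regions = [
    ((10, 20, 80, 40), embed_string("hustru")),
    ((90, 20, 150, 40), embed_string("court")),
    ((10, 50, 70, 70), embed_string("record")),
]

results = search("hustru", regions, top_k=2)
for score, box in results:
    print(f"{score:.3f} {box}")
```

Because the query string and the candidate regions live in the same vector space, searching reduces to a nearest-neighbor ranking, and no explicit segmentation of the page into words is needed at query time beyond the proposed regions.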
