Segmentation-free word spotting using SIFT

This paper presents a method for word spotting based on the scale-invariant feature transform (SIFT). The method handles multilingual document images and requires neither skew correction nor any segmentation step. The algorithm for detecting query words in test images is described; it consists of detecting keypoints, matching them, and clustering the matched pairs. Experiments on English and Korean document image collections indicate that the proposed method is promising.
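The matching and clustering stages can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes keypoints have already been extracted as `(x, y, descriptor)` tuples, matches them with a plain descriptor-distance threshold, and clusters the matched pairs by voting on their quantized translation offset. The function names, the `max_dist`, `cell`, and `min_votes` parameters, and the translation-only voting scheme (which ignores scale and rotation) are all assumptions made for illustration.

```python
from collections import defaultdict
from math import dist


def match_keypoints(query, page, max_dist=0.3):
    """Pair each query keypoint with every page keypoint whose descriptor is close.

    query, page: lists of (x, y, descriptor) tuples (hypothetical layout).
    A simple distance threshold is assumed so that a word occurring
    several times on a page can produce several matches.
    """
    pairs = []
    for qi, (_, _, q_desc) in enumerate(query):
        for pi, (_, _, p_desc) in enumerate(page):
            if dist(q_desc, p_desc) <= max_dist:
                pairs.append((qi, pi))
    return pairs


def cluster_by_offset(query, page, pairs, cell=10, min_votes=3):
    """Group matched pairs that agree on the query-to-page translation.

    Each pair votes for a quantized (dx, dy) offset; offsets that collect
    at least min_votes pairs are kept as candidate word locations.
    """
    votes = defaultdict(list)
    for qi, pi in pairs:
        qx, qy, _ = query[qi]
        px, py, _ = page[pi]
        key = (round((px - qx) / cell), round((py - qy) / cell))
        votes[key].append((qi, pi))
    return {key: group for key, group in votes.items() if len(group) >= min_votes}


if __name__ == "__main__":
    # Toy data: three query keypoints, repeated twice on the page, plus one outlier.
    query = [(0, 0, (1.0, 0.0, 0.0)), (5, 0, (0.0, 1.0, 0.0)), (10, 2, (0.0, 0.0, 1.0))]
    page = [(x + dx, y + dy, d) for dx, dy in ((100, 50), (300, 200)) for x, y, d in query]
    page.append((40, 40, (0.5, 0.5, 0.5)))

    pairs = match_keypoints(query, page, max_dist=0.1)
    for offset, group in cluster_by_offset(query, page, pairs).items():
        print("candidate at offset cell", offset, "with", len(group), "matched pairs")
```

Voting on offsets needs no word or line boundaries, which fits a segmentation-free setting. A method that also tolerates skew would have to account for keypoint orientation and scale, which this sketch leaves out.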
