SnooperText: A text detection system for automatic indexing of urban scenes

We describe SnooperText, an original detector for textual information embedded in photos of building facades (such as names of stores, products and services) that we developed for the iTowns urban geographic information project. SnooperText locates candidate characters by using toggle-mapping image segmentation and character/non-character classification based on shape descriptors. The candidate characters are then grouped to form either candidate words or candidate text lines. These candidate regions are then validated by a text/non-text classifier using a HOG-based descriptor specifically tuned to single-line text regions. These operations are applied at multiple image scales in order to suppress irrelevant detail in character shapes and to avoid the use of overly large kernels in the segmentation. We show that SnooperText outperforms other published state-of-the-art text detection algorithms on standard image benchmarks. We also describe two metrics to evaluate the end-to-end performance of text extraction systems, and show that the use of SnooperText as a pre-filter significantly improves the performance of a general-purpose OCR algorithm when applied to photos of urban scenes.
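The segmentation step named above can be illustrated with a small sketch of generic toggle mapping. It shows the idea only and is not the SnooperText implementation. Each pixel is compared with the local minimum (erosion) and local maximum (dilation) of its neighbourhood and assigned to whichever is closer, while low-contrast neighbourhoods are left unlabeled. The function name `toggle_segment`, the square window, and the parameters `radius` and `min_contrast` are assumptions made for this example.

```python
def toggle_segment(image, radius=1, min_contrast=16):
    """Label each pixel 1 (dark), 2 (bright), or 0 (low-contrast region).

    `image` is a list of rows of grayscale values. The window shape and
    both parameters are illustrative assumptions, not values from the paper.
    """
    height, width = len(image), len(image[0])
    labels = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Gather the square neighbourhood, clipped at the image border.
            window = [
                image[j][i]
                for j in range(max(0, y - radius), min(height, y + radius + 1))
                for i in range(max(0, x - radius), min(width, x + radius + 1))
            ]
            lo, hi = min(window), max(window)  # local erosion and dilation
            if hi - lo < min_contrast:
                continue  # homogeneous area: leave the pixel unlabeled
            value = image[y][x]
            # Toggle: snap the pixel to the nearer of the two extremes.
            labels[y][x] = 1 if value - lo <= hi - value else 2
    return labels


# A dark vertical stroke (value 20) on a bright background (value 200).
image = [[200, 200, 20, 200, 200] for _ in range(5)]
for row in toggle_segment(image):
    print(row)  # each row prints [0, 2, 1, 2, 0]
```

In this toy image the stroke pixels receive label 1, the background pixels bordering the stroke receive label 2, and pixels far from any edge stay 0. Connected components of one label would then serve as candidate characters for the later classification stages.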
