Snoopertext: A multiresolution system for text detection in complex visual scenes

Text detection in natural images remains a very challenging task. In urban scenes, for instance, detection is difficult because text varies widely in shape, size, color, and orientation, and the image itself may be blurred or unevenly illuminated. In this paper, we describe a robust and accurate multiresolution approach to detect and classify text regions in such scenarios. Following a hypothesis generation/validation paradigm, we first segment images to detect character regions with a multiresolution algorithm able to handle large variations in character size. The segmented regions are then filtered using shape-based classification, and neighboring characters are merged to generate text hypotheses. A validation step computes a region signature based on texture analysis to reject false positives. We evaluate our algorithm on two challenging databases, achieving very good results.
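As a rough illustration of two ideas mentioned in the abstract, the sketch below builds a simple image pyramid (one common way to obtain multiple resolutions) and merges neighboring character boxes into text hypotheses. It is not the method of the paper: the 2x2 averaging, the greedy left-to-right merge rule, the `Box` layout, and the names `downsample`, `build_pyramid`, `merge_neighbors`, `min_size`, and `max_gap` are all assumptions made for this example. Segmentation, shape-based classification, and texture-based validation are not shown.

```python
from typing import List, Tuple

# Hypothetical box layout for this sketch: (x, y, width, height).
Box = Tuple[int, int, int, int]
Image = List[List[float]]


def downsample(image: Image) -> Image:
    """Halve the resolution by averaging each 2x2 block.

    A trailing odd row or column is dropped.
    """
    rows, cols = len(image) // 2, len(image[0]) // 2
    return [
        [
            (image[2 * r][2 * c] + image[2 * r][2 * c + 1]
             + image[2 * r + 1][2 * c] + image[2 * r + 1][2 * c + 1]) / 4.0
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def build_pyramid(image: Image, min_size: int = 2) -> List[Image]:
    """Return [full resolution, half, quarter, ...] down to min_size pixels per side."""
    levels = [image]
    while len(levels[-1]) // 2 >= min_size and len(levels[-1][0]) // 2 >= min_size:
        levels.append(downsample(levels[-1]))
    return levels


def merge_neighbors(boxes: List[Box], max_gap: int) -> List[Box]:
    """Greedily merge boxes that are horizontally close and vertically overlapping.

    Boxes are visited left to right; each one either extends the most recent
    group or starts a new one. This is an assumed rule, chosen for brevity.
    """
    groups: List[Box] = []
    for x, y, w, h in sorted(boxes):
        if groups:
            gx, gy, gw, gh = groups[-1]
            close = x - (gx + gw) <= max_gap
            overlap = y < gy + gh and gy < y + h
            if close and overlap:
                left, top = min(gx, x), min(gy, y)
                right = max(gx + gw, x + w)
                bottom = max(gy + gh, y + h)
                groups[-1] = (left, top, right - left, bottom - top)
                continue
        groups.append((x, y, w, h))
    return groups
```

In a sketch like this, a character that is too large to segment reliably at full resolution appears at half its size on each coarser level, so the same segmentation step could be run on every level and the detected boxes scaled back to the original coordinates before merging.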
