Text Localization in Natural Scene Images Based on Conditional Random Field

This paper proposes a novel hybrid method to robustly and accurately localize text in natural scene images. A text region detector is designed to generate a text confidence map, from which text components are segmented by a local binarization approach. A Conditional Random Field (CRF) model that considers both unary component properties and binary relationships between neighboring components is then used to label components as "text" or "non-text". Finally, text components are grouped into text lines with an energy minimization approach. Experimental results show that the proposed method gives promising performance compared with existing methods on the ICDAR 2003 competition dataset.
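
To make the component-labeling step more concrete, the following is a minimal toy sketch of a CRF-style energy with unary and pairwise terms. It is not the paper's model: the abstract does not specify the features, potentials, or inference procedure, so the cost values, the Potts-style pairwise penalty, the `pairwise_weight` parameter, and the brute-force search are all assumptions chosen for illustration. Label 1 stands for "text" and label 0 for "non-text".

```python
from itertools import product

def crf_energy(labels, unary, edges, pairwise_weight):
    """Energy of one labeling: unary costs plus a penalty per disagreeing edge.

    unary[i][l] is the (assumed) cost of giving component i the label l.
    edges lists pairs of neighboring components.
    """
    energy = sum(unary[i][label] for i, label in enumerate(labels))
    # Potts-style pairwise term (an assumption): neighbors prefer equal labels.
    energy += sum(pairwise_weight for i, j in edges if labels[i] != labels[j])
    return energy

def label_components(unary, edges, pairwise_weight):
    """Exhaustively search all binary labelings; only feasible for tiny graphs."""
    candidates = product((0, 1), repeat=len(unary))
    best = min(candidates,
               key=lambda labels: crf_energy(labels, unary, edges, pairwise_weight))
    return list(best)

# Hypothetical costs for four components: [cost of non-text, cost of text].
toy_unary = [
    [2.0, 0.2],  # component 0: looks like text
    [1.5, 0.3],  # component 1: looks like text
    [0.9, 1.0],  # component 2: ambiguous, slightly favors non-text on its own
    [0.1, 2.0],  # component 3: looks like background clutter
]
# Components 0-1 and 1-2 are neighbors; component 3 is isolated.
toy_edges = [(0, 1), (1, 2)]

with_context = label_components(toy_unary, toy_edges, pairwise_weight=0.5)
without_context = label_components(toy_unary, toy_edges, pairwise_weight=0.0)
```

In this toy setup, the ambiguous component 2 is labeled "text" only when the pairwise term is active, because agreeing with its text-like neighbor lowers the total energy. That is the intuition behind combining unary properties with neighboring-component relationships; a practical system would replace the exhaustive search with an efficient inference method.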
