Improving Scene Text Detection by Scale-Adaptive Segmentation and Weighted CRF Verification

This paper presents a hybrid method for detecting and localizing text in natural scene images through stroke segmentation, verification, and grouping. Two contributions improve system performance: 1) a scale-adaptive segmentation method for extracting stroke candidates, and 2) a conditional random field (CRF) model for stroke verification whose pairwise weights are obtained by local line fitting. In addition, color-based text region estimation guides segmentation and verification to make both steps more accurate. Experimental results on the ICDAR 2005 competition dataset show that the proposed approach detects and localizes scene text with high accuracy, even against noisy and complex backgrounds.
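To make the idea of a pairwise weight from local line fitting concrete, the following is a minimal sketch and not the paper's formulation. It assumes each stroke candidate is summarized by its centroid, fits a total-least-squares line to the centroids near a candidate pair, and maps the fit residual to a weight in (0, 1]. The function names, the `radius` neighborhood, and the exponential mapping with `scale` are all illustrative assumptions.

```python
import math


def line_fit_residual(points):
    """RMS perpendicular distance of 2-D points to their best-fit line.

    Uses total least squares: the smaller eigenvalue of the 2x2
    covariance matrix equals the mean squared perpendicular residual.
    """
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / n
    syy = sum((y - my) ** 2 for _, y in points) / n
    sxy = sum((x - mx) * (y - my) for x, y in points) / n
    half_trace = (sxx + syy) / 2
    disc = math.sqrt(((sxx - syy) / 2) ** 2 + sxy ** 2)
    # Clamp at zero to absorb floating-point round-off.
    return math.sqrt(max(half_trace - disc, 0.0))


def pairwise_weight(centroids, i, j, radius=60.0, scale=5.0):
    """Hypothetical pairwise weight for candidates i and j.

    Assumption (not from the paper): candidates whose neighborhood
    lies close to a straight line are more likely to belong to the
    same text line, so they receive a weight closer to 1.
    """
    ci, cj = centroids[i], centroids[j]
    # Local neighborhood: every centroid within `radius` of either candidate.
    local = [p for p in centroids
             if math.dist(p, ci) <= radius or math.dist(p, cj) <= radius]
    return math.exp(-line_fit_residual(local) / scale)
```

In a CRF of this kind, such a weight would typically scale the pairwise potential between neighboring candidates, so that well-aligned neighbors influence each other's text/non-text labels more strongly than poorly aligned ones. How the paper actually defines and uses the weight is not stated in the abstract.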
