Scene Text Detection via Integrated Discrimination of Component Appearance and Consensus

In this paper, we propose an approach to scene text detection that leverages both the appearance and the consensus of connected components. Component appearance is modeled with an SVM-based dictionary classifier, and component consensus is represented with color and spatial layout features. Responses of the dictionary classifier are integrated with the consensus features into a discriminative model, in which the importance of each feature is determined with a text-level training procedure. During detection, hypotheses are generated on component pairs, and an iterative extension procedure aggregates the hypotheses into text objects. The discriminative model is used both to perform classification and to control the extension. Experiments show that the proposed approach reaches the state of the art in both detection accuracy and computational efficiency; in particular, it performs best on low-resolution text in cluttered backgrounds.
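
The detection loop described above (seed hypotheses on component pairs, then iterative extension gated by a discriminative score) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The `Component` fields, the four consensus features, the `WEIGHTS` and `BIAS` values, and the function names are all hypothetical; the `appearance` field merely stands in for the response of the SVM-based dictionary classifier, and the weights that the paper learns through text-level training are fixed by hand here.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass
class Component:
    x: float           # centre x of the connected component
    y: float           # centre y
    h: float           # height
    color: float       # mean gray level in [0, 1]
    appearance: float  # stand-in for the dictionary classifier response


def consensus(a, b):
    """Hypothetical color and layout agreement features, each in [0, 1]."""
    tall = max(a.h, b.h)
    color_sim = 1.0 - abs(a.color - b.color)
    height_sim = min(a.h, b.h) / tall
    alignment = 1.0 - min(1.0, abs(a.y - b.y) / tall)
    proximity = 1.0 if abs(a.x - b.x) / tall <= 2.0 else 0.0
    return [color_sim, height_sim, alignment, proximity]


# Assumed values; the paper determines feature importance by training.
WEIGHTS = [1.0, 1.0, 1.0, 1.0, 1.0]  # appearance + 4 consensus features
BIAS = -4.2


def score(components, group, cand):
    """Linear score for attaching component `cand` to the index list `group`."""
    nearest = min(group, key=lambda i: abs(components[i].x - components[cand].x))
    feats = [components[cand].appearance]
    feats += consensus(components[nearest], components[cand])
    return sum(w * f for w, f in zip(WEIGHTS, feats)) + BIAS


def detect(components):
    """Seed on component pairs, then extend each seed until nothing is added."""
    used, texts = set(), []
    for a, b in combinations(range(len(components)), 2):
        if a in used or b in used:
            continue
        # A pair is a hypothesis only if each member accepts the other.
        if score(components, [a], b) <= 0 or score(components, [b], a) <= 0:
            continue
        group, grown = [a, b], True
        while grown:  # iterative extension, controlled by the same score
            grown = False
            for c in range(len(components)):
                if c in group or c in used:
                    continue
                if score(components, group, c) > 0:
                    group.append(c)
                    grown = True
        used.update(group)
        texts.append(sorted(group, key=lambda i: components[i].x))
    return texts


# Three letter-like components on one line plus one clutter component.
demo = [
    Component(10, 50, 20, 0.20, 0.90),
    Component(32, 51, 20, 0.22, 0.80),
    Component(55, 50, 21, 0.20, 0.85),
    Component(200, 120, 60, 0.90, 0.10),
]
print(detect(demo))  # [[0, 1, 2]]
```

In this sketch a single linear score serves both roles the abstract names: it decides whether a pair is a valid hypothesis and whether the extension continues. The clutter component is rejected because its appearance response and its consensus with the nearest group member are both low.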
