Scene Text Detection Based on Robust Stroke Width Transform and Deep Belief Network

Text detection in natural scene images remains an open and challenging problem because the appearance of text varies widely and interacts with its surrounding context. In this paper, we present a novel text detection method that combines two main ingredients: a robust extension of the Stroke Width Transform (SWT) and Deep Belief Network (DBN) based discrimination of text objects from other scene components. In the former, smoothness-based edge information is combined with gradient information to generate high-quality edge images, and various edge cues are exploited in SWT-based Connected Component (CC) analysis to eliminate inter-character and intra-character errors. In the latter, a DBN is used to learn efficient representations that discriminate character CCs from non-character CCs, resulting in improved detection accuracy. The proposed method is evaluated on the public ICDAR and SVT datasets and achieves state-of-the-art results, demonstrating its effectiveness.
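To make the stroke-width idea concrete, the following is a minimal sketch of the basic ray-casting step behind SWT, not the robust extension proposed in this paper. It assumes edge pixels and unit gradient directions are already available; the function name `stroke_widths`, the `max_len` limit, and the 30-degree opposing-gradient tolerance are illustrative assumptions rather than values taken from the paper.

```python
import math


def stroke_widths(edges, grad, max_len=50):
    """Estimate a stroke width for each edge pixel by ray casting.

    edges: set of (row, col) edge pixels (hypothetical input format).
    grad:  dict mapping each edge pixel to a unit gradient (d_row, d_col).
    Returns a dict mapping edge pixels to the distance to the opposing edge.
    """
    # Assumed tolerance: gradients count as "opposing" if within 30 degrees.
    opposing = -math.cos(math.pi / 6)
    widths = {}
    for (r, c) in edges:
        dr, dc = grad[(r, c)]
        for step in range(1, max_len + 1):
            rr = round(r + dr * step)
            cc = round(c + dc * step)
            if (rr, cc) == (r, c):
                continue  # ray has not left the starting pixel yet
            if (rr, cc) in edges:
                gr, gc = grad[(rr, cc)]
                # Keep the ray only if the far edge faces back toward us.
                if dr * gr + dc * gc < opposing:
                    widths[(r, c)] = math.hypot(rr - r, cc - c)
                break  # stop at the first edge pixel either way
    return widths


# Toy example: a vertical stroke whose two edges are 4 pixels apart.
edges = {(row, 2) for row in range(5)} | {(row, 6) for row in range(5)}
grad = {}
for row in range(5):
    grad[(row, 2)] = (0, 1)    # left edge, gradient points right
    grad[(row, 6)] = (0, -1)   # right edge, gradient points left

widths = stroke_widths(edges, grad)
```

In this toy input every ray meets an opposing edge, so each edge pixel receives the same width. Pixels inside a character tend to share a nearly constant stroke width, which is the property that SWT-based CC analysis relies on.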
