Graph-Based Background Suppression for Scene Text Detection

Detecting text in video frames or natural scene images is challenging because of complex backgrounds, varied fonts, and changing illumination conditions. A preprocessing stage that suppresses nontext areas, and thereby highlights text areas, provides the basis for subsequent text detection. In this paper, we propose a novel graph-based background suppression method for scene text detection. Treating each pixel as a node, our approach incorporates pixel-level and context-level features into a graph. Various factors contribute to the unary and pairwise cost functions, which are optimized with a max-flow/min-cut algorithm [16] to produce a binary image in which nontext pixels are suppressed and text pixels are highlighted. Furthermore, the proposed background suppression method can easily be combined with other detection methods to improve their performance. Experimental results on the ICDAR 2011 competition dataset show promising performance.
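The labeling step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the cost terms, so everything concrete here is an assumption made for illustration: the unary cost is a squared distance between pixel intensity and hypothetical mean intensities `text_mean` and `bg_mean`, the pairwise cost is a constant penalty `smoothness` between 4-connected neighbors with different labels, and the solver is a plain Edmonds-Karp max-flow rather than the algorithm cited in the paper. The function name `min_cut_labels` is likewise hypothetical.

```python
from collections import deque


def min_cut_labels(image, text_mean, bg_mean, smoothness):
    """Label each pixel 1 (text) or 0 (background) with an s-t minimum cut.

    Illustrative only: the cost terms are assumptions, not the paper's.
    `image` is a list of rows of integer intensities.
    """
    h, w = len(image), len(image[0])
    source, sink = h * w, h * w + 1
    cap = [dict() for _ in range(h * w + 2)]  # residual capacities

    def add_edge(u, v, c):
        cap[u][v] = cap[u].get(v, 0) + c
        cap[v].setdefault(u, 0)  # reverse residual edge

    for y in range(h):
        for x in range(w):
            p = y * w + x
            cost_text = (image[y][x] - text_mean) ** 2
            cost_bg = (image[y][x] - bg_mean) ** 2
            # Source side means "text": cutting p->sink pays cost_text,
            # cutting source->p pays cost_bg.
            add_edge(source, p, cost_bg)
            add_edge(p, sink, cost_text)
            # Pairwise term: constant penalty across 4-connected neighbors.
            for ny, nx in ((y, x + 1), (y + 1, x)):
                if ny < h and nx < w:
                    q = ny * w + nx
                    add_edge(p, q, smoothness)
                    add_edge(q, p, smoothness)

    def reachable_from_source():
        """Breadth-first search over edges with remaining capacity."""
        parent = {source: None}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        return parent

    # Edmonds-Karp: augment along shortest paths until the sink is cut off.
    while True:
        parent = reachable_from_source()
        if sink not in parent:
            break
        path, v = [], sink
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(cap[u][v] for u, v in path)
        for u, v in path:
            cap[u][v] -= bottleneck
            cap[v][u] += bottleneck

    # Pixels still reachable from the source lie on the "text" side.
    return [[1 if y * w + x in parent else 0 for x in range(w)]
            for y in range(h)]


image = [[10, 12, 200],
         [11, 120, 210],
         [9, 13, 205]]
no_context = min_cut_labels(image, text_mean=200, bg_mean=10, smoothness=0)
with_context = min_cut_labels(image, text_mean=200, bg_mean=10, smoothness=3000)
```

In this toy image the center pixel (intensity 120) is slightly closer to the assumed text mean, so with `smoothness=0` it is labeled text. With a sufficiently large pairwise penalty, its three background neighbors outweigh the unary preference and it is suppressed, which is the kind of context-driven decision that a pairwise term makes possible.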

[1]  Michael R. Lyu, et al. A comprehensive method for multilingual video text detection, localization, and extraction, 2005, IEEE Transactions on Circuits and Systems for Video Technology.

[2]  John F. Canny, et al. A Computational Approach to Edge Detection, 1986, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[3]  David S. Doermann, et al. Camera-based analysis of text and documents: a survey, 2005, International Journal of Document Analysis and Recognition (IJDAR).

[4]  S.M. Lucas, et al. ICDAR 2005 text locating competition results, 2005, Eighth International Conference on Document Analysis and Recognition (ICDAR'05).

[5]  Anil K. Jain, et al. Locating text in complex color images, 1995, Proceedings of 3rd International Conference on Document Analysis and Recognition.

[6]  Vladimir Kolmogorov, et al. An Experimental Comparison of Min-Cut/Max-Flow Algorithms for Energy Minimization in Vision, 2004, IEEE Trans. Pattern Anal. Mach. Intell.

[7]  Alan L. Yuille, et al. Detecting and reading text in natural scenes, 2004, CVPR 2004.

[8]  Chew Lim Tan, et al. A Laplacian Approach to Multi-oriented Text Detection in Video, 2011, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[9]  Marie-Pierre Jolly, et al. Interactive graph cuts for optimal boundary & region segmentation of objects in N-D images, 2001, Proceedings Eighth IEEE International Conference on Computer Vision. ICCV 2001.

[10]  Vladimir Kolmogorov, et al. An experimental comparison of min-cut/max-flow algorithms for energy minimization in vision, 2001, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[11]  Yonatan Wexler, et al. Detecting text in natural scenes with stroke width transform, 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[12]  Jing Zhang, et al. Extraction of Text Objects in Video Documents: Recent Progress, 2008, 2008 The Eighth IAPR International Workshop on Document Analysis Systems.

[13]  Pietro Perona, et al. A Bayesian hierarchical model for learning natural scene categories, 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[14]  Chunheng Wang, et al. Text detection in images based on unsupervised classification of edge-based features, 2005, Eighth International Conference on Document Analysis and Recognition (ICDAR'05).

[15]  Cheng-Lin Liu, et al. A Hybrid Approach to Detect and Localize Texts in Natural Scene Images, 2011, IEEE Transactions on Image Processing.

[16]  Hae-Kwang Kim, et al. Efficient Automatic Text Location Method and Content-Based Indexing and Structuring of Video Database, 1996, J. Vis. Commun. Image Represent.

[17]  Anil K. Jain, et al. Text information extraction in images and video: a survey, 2004, Pattern Recognit.

[18]  Bill Triggs, et al. Histograms of oriented gradients for human detection, 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[19]  Leo Breiman, et al. Random Forests, 2001, Machine Learning.

[20]  Hang Joon Kim, et al. Automatic text detection and removal in video sequences, 2003, Pattern Recognit. Lett.

[21]  D. Alspach. A Gaussian sum approach to the multi-target identification-tracking problem, 1975, Autom.

[22]  N. Otsu. A threshold selection method from gray level histograms, 1979.

[23]  Wen Gao, et al. Fast and robust text detection in images and video frames, 2005, Image Vis. Comput.