Text detection in natural and computer-generated images

Text detection is one of the most challenging and commonly addressed problems in computer vision. Detecting text regions is the first step of text recognition systems, known as Optical Character Recognition (OCR) systems. This step requires separating text regions from non-text regions. In this paper, we use Maximally Stable Extremal Regions (MSER) to obtain the initial text region candidates. The number of candidates is then reduced using geometric and stroke width properties, and the remaining candidate regions are joined to form text groups. Finally, the Tesseract OCR engine is used as the last step to eliminate non-text groups. We evaluated the proposed system on the KAIST and ICDAR datasets for both natural and computer-generated images. For natural images, 82.7% precision and 52.0% f-accuracy are achieved; for computer-generated images, 64.0% precision and 65.2% f-accuracy are achieved.
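
The filtering and grouping stages described above can be illustrated with a minimal Python sketch. It is not the implementation used in the paper: the `Region` type, the function names, and every threshold (aspect-ratio bounds, stroke width variation limit, gap and overlap ratios) are assumptions chosen for illustration only. MSER extraction and the final Tesseract check are left out, so the sketch starts from candidate boxes that already carry sampled stroke widths.

```python
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass
class Region:
    """Hypothetical candidate region: bounding box plus sampled stroke widths."""
    x: int
    y: int
    w: int
    h: int
    stroke_widths: list


def passes_geometry(r, min_aspect=0.1, max_aspect=10.0):
    # Reject degenerate boxes and extreme aspect ratios (assumed bounds).
    if r.w <= 0 or r.h <= 0:
        return False
    return min_aspect <= r.w / r.h <= max_aspect


def passes_stroke_width(r, max_cv=0.5):
    # Characters tend to have near-constant stroke width, so reject regions
    # whose coefficient of variation exceeds an assumed limit.
    if not r.stroke_widths:
        return False
    m = mean(r.stroke_widths)
    return m > 0 and pstdev(r.stroke_widths) / m <= max_cv


def group_regions(regions, max_gap_ratio=1.0, min_overlap_ratio=0.5):
    # Greedy left-to-right grouping: a region joins a group when it is
    # horizontally close to, and vertically overlapping with, the group's
    # last member. Both ratios are assumptions.
    groups = []
    for r in sorted(regions, key=lambda reg: reg.x):
        for g in groups:
            last = g[-1]
            gap = r.x - (last.x + last.w)
            overlap = min(r.y + r.h, last.y + last.h) - max(r.y, last.y)
            if (gap <= max_gap_ratio * max(r.h, last.h)
                    and overlap >= min_overlap_ratio * min(r.h, last.h)):
                g.append(r)
                break
        else:
            groups.append([r])
    return groups


def bounding_box(group):
    # Smallest box (x, y, w, h) that encloses every region in the group.
    x0 = min(r.x for r in group)
    y0 = min(r.y for r in group)
    x1 = max(r.x + r.w for r in group)
    y1 = max(r.y + r.h for r in group)
    return (x0, y0, x1 - x0, y1 - y0)


def detect_text_groups(candidates):
    # Filter candidates, then merge the survivors into text-group boxes.
    kept = [r for r in candidates
            if passes_geometry(r) and passes_stroke_width(r)]
    return [bounding_box(g) for g in group_regions(kept)]
```

In a full pipeline, each returned box would be cropped from the image and passed to an OCR engine, and groups for which no text is recognized would be discarded.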
