Binarization of Color Character Strings in Scene Images using Deep Neural Network

This paper addresses the problem of binarizing multicolored character strings in scene images with complex backgrounds and heavy image degradations. The proposed method consists of three steps. The first step is combinatorial generation of binarized images via every dichotomization of K clusters obtained by K-means clustering of constituent pixels of an input image in the HSI color space. The second step is classification of each binarized image using deep neural network into two categories: character string and non-character string. The final step is selection of a single binarized image with the highest degree of character string as an optimal binarization result. Experimental results using ICDAR 2003 robust word recognition dataset show that the proposed method achieves a correct binarization rate of 87.4% that is highly competitive with the state of the art of binarization of scene character strings.

[1]  Andrew Zisserman,et al.  Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.

[2]  Simon M. Lucas,et al.  ICDAR 2003 robust reading competitions , 2003, Seventh International Conference on Document Analysis and Recognition, 2003. Proceedings..

[3]  Josef Kittler,et al.  Minimum error thresholding , 1986, Pattern Recognit..

[4]  Jimmy Ba,et al.  Adam: A Method for Stochastic Optimization , 2014, ICLR.

[5]  N. Otsu A threshold selection method from gray level histograms , 1979 .

[6]  Geoffrey E. Hinton,et al.  ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.

[7]  Tatiana Novikova,et al.  Image Binarization for End-to-End Text Understanding in Natural Images , 2013, 2013 12th International Conference on Document Analysis and Recognition.

[8]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[9]  Wakahara Toru,et al.  Binarization of Color Character Strings in Scene Images Using K-means Clustering and Support Vector Machines , 2010 .

[10]  Matti Pietikäinen,et al.  Adaptive document image binarization , 2000, Pattern Recognit..

[11]  Wayne Niblack,et al.  An introduction to digital image processing , 1986 .