An improved binarization method using inter- and intra-block features for natural images

Binarization of natural images is important for text localization and content-based analysis. In this work, a new adaptive method is introduced that improves binarization results on images degraded by complex backgrounds, non-uniform illumination, and variations in text font, size, color, and line orientation. The method has three main stages. First, an initial threshold is calculated for each pixel to produce candidate blocks. Second, new inter- and intra-block features are extracted from the candidates based on the characteristics of text. Finally, each block is scored from 0 to s using these features, and blocks with low scores are treated as part of the background. In extensive experiments on the ICDAR 2005 competition dataset, the method outperformed two well-known techniques.
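The three stages can be illustrated with a minimal sketch. The abstract does not specify the thresholding rule, the block features, or the scoring range, so everything concrete here is an assumption made for illustration: a local-mean threshold for stage one, 4-connected components as the candidate blocks, two hypothetical intra-block features (area and contrast against the background), one hypothetical inter-block feature (a nearby block of similar height), and a score from 0 to 3. It is not the method evaluated in the paper.

```python
from collections import deque


def local_threshold(img, radius=1):
    """Stage 1 (assumed rule): a pixel is foreground if darker than its local mean."""
    h, w = len(img), len(img[0])
    mask = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [img[j][i]
                      for j in range(max(0, y - radius), min(h, y + radius + 1))
                      for i in range(max(0, x - radius), min(w, x + radius + 1))]
            mask[y][x] = img[y][x] < sum(window) / len(window)
    return mask


def candidate_blocks(mask):
    """Group foreground pixels into 4-connected components (the candidate blocks)."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    blocks = []
    for y in range(h):
        for x in range(w):
            if not mask[y][x] or seen[y][x]:
                continue
            seen[y][x] = True
            queue, pixels = deque([(y, x)]), []
            while queue:
                cy, cx = queue.popleft()
                pixels.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            blocks.append(pixels)
    return blocks


def height(block):
    ys = [y for y, _ in block]
    return max(ys) - min(ys) + 1


def center_x(block):
    xs = [x for _, x in block]
    return (min(xs) + max(xs)) / 2


def score_block(img, block, others, background, min_area=3, min_contrast=40):
    """Stages 2 and 3: score a block from 0 to 3 with hypothetical features."""
    score = 0
    # Intra-block feature (assumed): the block is not a tiny speck.
    if len(block) >= min_area:
        score += 1
    # Intra-block feature (assumed): the block contrasts with the background.
    mean_intensity = sum(img[y][x] for y, x in block) / len(block)
    if abs(background - mean_intensity) >= min_contrast:
        score += 1
    # Inter-block feature (assumed): a nearby block of similar height exists,
    # as neighboring characters in a text line usually have.
    h = height(block)
    for other in others:
        ratio = height(other) / h
        if 0.5 <= ratio <= 2 and abs(center_x(other) - center_x(block)) <= 3 * h:
            score += 1
            break
    return score


def binarize(img, min_score=2):
    """Keep only the blocks whose score reaches min_score; the rest is background."""
    mask = local_threshold(img)
    blocks = candidate_blocks(mask)
    bg_pixels = [img[y][x] for y in range(len(img))
                 for x in range(len(img[0])) if not mask[y][x]]
    background = sum(bg_pixels) / len(bg_pixels) if bg_pixels else 0.0
    out = [[False] * len(img[0]) for _ in img]
    for i, block in enumerate(blocks):
        others = blocks[:i] + blocks[i + 1:]
        if score_block(img, block, others, background) >= min_score:
            for y, x in block:
                out[y][x] = True
    return out
```

Calling `binarize(img)` on a grayscale image given as a list of rows of intensities returns a boolean mask of the same shape. The thresholds `min_area`, `min_contrast`, and `min_score` are placeholder values chosen for the sketch, not values from the paper.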
