Binarization of degraded document images based on contrast enhancement

Binarizing degraded document images remains a major challenge because degradation takes many forms, such as uneven illumination, variation in image contrast, blur caused by humidity, and bleed-through. This paper presents a new binarization method for degraded document images. The proposed algorithm focuses on how grayscale contrast differs between areas of the image. A quadtree is used to divide the image into areas adaptively. A contrast enhancement suited to each area is then selected to adjust its local grayscale contrast. Finally, the local threshold is taken as the mean of the foreground and background gray values, which are determined from the frequency of the gray values. The proposed algorithm was tested on datasets from the Document Image Binarization Contest series (DIBCO 2009, H-DIBCO 2010, DIBCO 2011, and H-DIBCO 2012). Compared with five other classical algorithms, the proposed algorithm produced binarized images with the highest F-measure and peak signal-to-noise ratio (PSNR) and the highest recognition accuracy.
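
The pipeline outlined in the abstract (quadtree division, local contrast enhancement, then a threshold at the mean of the foreground and background gray values) can be illustrated with a minimal sketch. The abstract does not give the split criterion, the enhancement functions, or the exact frequency rule, so everything specific here is an assumption made for illustration: the function names, the parameters `min_size`, `max_range`, and `min_contrast`, the use of a plain linear stretch in place of the paper's contrast enhancements, and the choice of the most frequent gray value on each side of the block mean as the foreground and background estimates.

```python
import numpy as np


def split_quadtree(img, y0, x0, h, w, min_size=16, max_range=60):
    """Yield (y0, x0, h, w) leaf blocks of a quadtree over img.

    Assumed split rule: a block is divided into four quadrants while its
    gray-level range exceeds max_range and the quadrants would still be
    at least min_size pixels on each side.
    """
    block = img[y0:y0 + h, x0:x0 + w]
    spread = int(block.max()) - int(block.min())
    if spread <= max_range or h < 2 * min_size or w < 2 * min_size:
        yield (y0, x0, h, w)
        return
    hh, hw = h // 2, w // 2
    quadrants = ((0, 0, hh, hw), (0, hw, hh, w - hw),
                 (hh, 0, h - hh, hw), (hh, hw, h - hh, w - hw))
    for dy, dx, bh, bw in quadrants:
        yield from split_quadtree(img, y0 + dy, x0 + dx, bh, bw,
                                  min_size, max_range)


def stretch_contrast(block):
    """Linearly stretch a block to 0..255 (stand-in enhancement, assumed)."""
    lo, hi = float(block.min()), float(block.max())
    if hi == lo:
        return block.copy()
    scaled = (block.astype(np.float64) - lo) * 255.0 / (hi - lo)
    return np.round(scaled).astype(np.uint8)


def local_threshold(block):
    """Mean of the foreground and background gray values of a uint8 block.

    Assumed frequency rule: foreground is the most frequent gray value at
    or below the block mean, background the most frequent value above it.
    """
    hist = np.bincount(block.ravel(), minlength=256)
    m = int(block.mean())
    fg = int(np.argmax(hist[:m + 1]))
    bg = m + 1 + int(np.argmax(hist[m + 1:])) if m < 255 else 255
    return (fg + bg) / 2.0


def binarize(img, min_size=16, max_range=60, min_contrast=30):
    """Return a uint8 image with text pixels 0 and background pixels 255."""
    img = np.asarray(img, dtype=np.uint8)
    out = np.full(img.shape, 255, dtype=np.uint8)
    rows, cols = img.shape
    for y0, x0, h, w in split_quadtree(img, 0, 0, rows, cols,
                                       min_size, max_range):
        block = img[y0:y0 + h, x0:x0 + w]
        if int(block.max()) - int(block.min()) < min_contrast:
            continue  # assumption: a nearly flat block holds background only
        enhanced = stretch_contrast(block)
        t = local_threshold(enhanced)
        out[y0:y0 + h, x0:x0 + w][enhanced <= t] = 0
    return out


# Synthetic page: light background (200) with one dark stroke (40).
page = np.full((64, 64), 200, dtype=np.uint8)
page[20:25, 10:50] = 40
result = binarize(page)
```

Skipping nearly flat blocks is a design choice of this sketch, not something the abstract states: stretching a block that contains only background would amplify noise to the full gray range and produce speckle after thresholding.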
