Gabor Filters for Degraded Document Image Binarization

Most classical methods for binarizing degraded documents compute the binarization threshold from pixel gray-level intensity or from simple neighborhood statistics such as the mean or variance. Moreover, this information is extracted from the spatial domain of the document image, where it is not very discriminative. In this paper, we propose to estimate texture information with Gabor filters for ancient degraded documents. First, the dominant slant angle of the script in the document image is computed using the Fourier transform. Then, this dominant angle is used within a weighted sum of angles in a Gabor filter bank to capture the document image foreground (text) more efficiently. This information, combined with the variance and the mean extracted from the spatial and frequency domains, respectively, is used to estimate the binarization threshold. Three variants, based on Niblack's, Sauvola's, and Wolf's thresholds, are used to evaluate the performance of the Gabor filter bank. Experimental results on the DIBCO datasets show that the proposed method is more appropriate for low-contrast documents and ink bleed-through degradations.
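For orientation, the building blocks named above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the method of the paper: it shows only the standard textbook forms of Niblack's, Sauvola's, and Wolf's thresholds and of a real-valued Gabor kernel. The function names, the default parameters (k, R, gamma, psi), and the use of a flat list of gray levels as a window are all choices made here for illustration. How the Gabor response is weighted by the dominant slant angle and combined with the mean and variance is not specified in the abstract and is deliberately left out.

```python
import math


def local_stats(window):
    """Mean and population standard deviation of a flat list of gray levels."""
    n = len(window)
    mean = sum(window) / n
    var = sum((p - mean) ** 2 for p in window) / n
    return mean, math.sqrt(var)


def niblack_threshold(mean, std, k=-0.2):
    """Niblack: T = m + k * s (k = -0.2 is a common default, assumed here)."""
    return mean + k * std


def sauvola_threshold(mean, std, k=0.5, R=128.0):
    """Sauvola: T = m * (1 + k * (s / R - 1)); R is the assumed dynamic range of s."""
    return mean * (1.0 + k * (std / R - 1.0))


def wolf_threshold(mean, std, min_gray, max_std, k=0.5):
    """Wolf: T = (1 - k) * m + k * M + k * (s / R) * (m - M).

    M (min_gray) is the minimum gray level of the image and R (max_std)
    is the largest local standard deviation over all windows.
    """
    return (1.0 - k) * mean + k * min_gray + k * (std / max_std) * (mean - min_gray)


def gabor_kernel(size, wavelength, theta, sigma, gamma=1.0, psi=0.0):
    """Real part of a 2-D Gabor kernel oriented at theta (radians).

    size is assumed to be odd so that the kernel has a central sample.
    """
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate coordinates so the sinusoidal carrier runs along theta.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2.0 * sigma ** 2))
            carrier = math.cos(2.0 * math.pi * xr / wavelength + psi)
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel


# Example usage on a toy window of gray levels (hypothetical values).
m, s = local_stats([10, 10, 200, 200])
thresholds = {
    "niblack": niblack_threshold(m, s),
    "sauvola": sauvola_threshold(m, s),
    "wolf": wolf_threshold(m, s, min_gray=10, max_std=s),
}
bank = [gabor_kernel(5, 4.0, math.radians(a), 2.0) for a in (0, 45, 90, 135)]
```

In a full pipeline, each kernel in such a bank would be convolved with the image and the responses merged, for instance with weights that favor orientations near the estimated slant angle. That weighting scheme is an assumption for illustration and would need to follow the paper's own definition.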
