Review of Different Binarization Approaches on Degraded Document Images

Binarization is a preprocessing step for reading text documents automatically with optical character recognition (OCR). It separates the foreground text from the background of the image, and the resulting binary image is the essential input format for document analysis. Binarization becomes challenging for old document images, which usually suffer from degradations such as uneven illumination, image contrast variation and bleed-through. This paper compares several image binarization techniques in order to find the best approach for binarizing degraded document images. The Bernsen, Multiple Thresholding, Deghost, Fuzzy C-Means and Triangle methods were selected for comparison and applied to the H-DIBCO 2013 dataset. According to the image quality assessment (IQA) results, the Fuzzy C-Means method was more effective than the other methods. These findings are intended to give researchers a direction for future work.
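As an illustration of one of the compared techniques, the following is a minimal sketch of triangle thresholding on a grayscale histogram. It is not the implementation evaluated in the paper: the function names `triangle_threshold` and `binarize` are hypothetical, the image is assumed to be a list of rows of integer gray levels in 0..255, and the text is assumed to be darker than the background.

```python
def triangle_threshold(hist):
    """Return the bin index chosen by the triangle method for a 1-D histogram."""
    nonzero = [i for i, count in enumerate(hist) if count > 0]
    if not nonzero:
        raise ValueError("histogram is empty")
    lo, hi = nonzero[0], nonzero[-1]
    peak = max(range(len(hist)), key=lambda i: hist[i])

    # Use the longer tail of the histogram, measured from the peak.
    far = lo if (peak - lo) > (hi - peak) else hi
    if far == peak:
        return peak

    x1, y1, x2, y2 = peak, hist[peak], far, hist[far]
    step = 1 if far > peak else -1
    best, best_gap = peak, -1.0
    for i in range(peak, far + step, step):
        # Height of the peak-to-tail line at bin i.
        line_y = y1 + (y2 - y1) * (i - x1) / (x2 - x1)
        # The vertical gap below the line is proportional to the
        # perpendicular distance, so maximizing it picks the same bin.
        gap = line_y - hist[i]
        if gap > best_gap:
            best, best_gap = i, gap
    return best


def binarize(image):
    """Label pixels as text (1) or background (0), assuming dark text."""
    hist = [0] * 256
    for row in image:
        for value in row:
            hist[value] += 1
    t = triangle_threshold(hist)
    return [[1 if value <= t else 0 for value in row] for row in image]


if __name__ == "__main__":
    page = [[200, 200, 20],
            [200, 20, 200],
            [200, 200, 200]]
    for row in binarize(page):
        print(row)
```

A single global threshold of this kind is simple, but it cannot adapt to uneven illumination, which is one reason local methods such as Bernsen and clustering methods such as Fuzzy C-Means are included in the comparison.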
