A Hybrid Binarization Technique for Document Images

This chapter presents a binarization technique designed specifically for historical document images. Existing binarization techniques either search for an appropriate global threshold or adapt a local threshold to each image area in order to remove smears, stains, uneven illumination, and similar degradations. The hybrid approach presented here first applies a global thresholding technique and then identifies the image areas that are most likely to still contain noise. Each of these areas is re-processed separately to improve the quality of the binarization. Evaluation results compare our technique with existing ones and indicate that the proposed approach is effective, combining the advantages of global and local thresholding. Finally, future directions of our research are outlined.
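The two-stage idea described above can be illustrated with a minimal sketch. It is not the chapter's actual algorithm: the abstract names neither the global method nor the criterion for flagging noisy areas, so Otsu's threshold stands in for both the global and the local step, and the fixed tile size `tile` and the foreground-fraction limit `max_ink` are hypothetical parameters chosen only for illustration.

```python
import numpy as np


def otsu_threshold(gray):
    """Return Otsu's threshold for a uint8 array (stand-in for the global step)."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    levels = np.arange(256)
    w0 = np.cumsum(hist)              # number of pixels with value <= t
    w1 = hist.sum() - w0              # number of pixels with value > t
    sum0 = np.cumsum(hist * levels)   # intensity sum of pixels with value <= t
    valid = (w0 > 0) & (w1 > 0)       # both classes must be non-empty
    mu0 = np.zeros(256)
    mu1 = np.zeros(256)
    mu0[valid] = sum0[valid] / w0[valid]
    mu1[valid] = (sum0[-1] - sum0[valid]) / w1[valid]
    # Between-class variance (unnormalized); invalid splits get -1.
    between = np.where(valid, w0 * w1 * (mu0 - mu1) ** 2, -1.0)
    return int(np.argmax(between))


def hybrid_binarize(gray, tile=32, max_ink=0.5):
    """Global threshold first, then re-threshold tiles that look noisy.

    Returns a boolean array where True marks foreground (dark) pixels.
    `tile` and `max_ink` are assumptions of this sketch: a tile is treated
    as "likely noisy" when more than `max_ink` of it is foreground after
    the global pass.
    """
    ink = gray <= otsu_threshold(gray)            # stage 1: global pass
    out = ink.copy()
    height, width = gray.shape
    for y in range(0, height, tile):
        for x in range(0, width, tile):
            if ink[y:y + tile, x:x + tile].mean() > max_ink:
                sub = gray[y:y + tile, x:x + tile]
                # Stage 2: local pass on the flagged tile only.
                out[y:y + tile, x:x + tile] = sub <= otsu_threshold(sub)
    return out
```

On a page whose right half is covered by a dark stain, the global pass alone marks the whole stain as foreground. The tiles over the stain exceed `max_ink`, so they are re-thresholded locally and only the strokes inside them survive. A practical implementation would need a more robust noise criterion and some handling of tile borders.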
