An adaptive logical method for binarization of degraded document images

Abstract: This paper describes a modified logical thresholding method for binarizing seriously degraded, very poor quality gray-scale document images. The method copes with complex signal-dependent noise, with variable background intensity caused by nonuniform illumination, shadow, smear, or smudge, and with very low contrast, and the output image shows no obvious loss of useful information. First, we analyse the clustering and connection characteristics of character strokes using run-length histograms of selected image regions across various inhomogeneous gray-scale backgrounds. We then propose a modified logical thresholding method that adaptively extracts the binary image from a degraded gray-scale document image with a complex, inhomogeneous background. The method adjusts both the size of the local area and the logical thresholding level according to the local run-length histogram and the local gray-scale inhomogeneity. It thresholds a variety of poor quality gray-scale document images automatically, requiring neither prior knowledge of the document image nor manual fine-tuning of parameters. Because it preserves useful information more accurately, without over-connecting or breaking character strokes, it has a wider range of applications than other methods.
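As a rough illustration of the run-length analysis mentioned in the abstract, the sketch below builds a histogram of horizontal foreground run lengths from a provisionally binarized region and takes its most frequent length as a stroke-width estimate. The function names, the restriction to horizontal runs, and the use of the histogram mode are assumptions made for illustration; the abstract does not specify how the run-length histogram is computed or interpreted.

```python
import numpy as np


def run_length_histogram(binary):
    """Count horizontal runs of foreground (True) pixels by length.

    hist[k] is the number of runs that are exactly k pixels long.
    """
    binary = np.asarray(binary, dtype=bool)
    width = binary.shape[1]
    hist = np.zeros(width + 1, dtype=int)
    # Pad each row with background so every run has a start and an end.
    padded = np.pad(binary, ((0, 0), (1, 1)), constant_values=False)
    diff = np.diff(padded.astype(np.int8), axis=1)
    for row in diff:
        starts = np.flatnonzero(row == 1)   # background -> foreground
        ends = np.flatnonzero(row == -1)    # foreground -> background
        np.add.at(hist, ends - starts, 1)
    return hist


def estimate_stroke_width(binary):
    """Return the most frequent run length (assumed stroke width), or 0."""
    hist = run_length_histogram(binary)
    if hist[1:].sum() == 0:
        return 0
    return int(np.argmax(hist[1:]) + 1)


# Hypothetical test region: two 3-pixel-wide vertical strokes plus one noise pixel.
region = np.zeros((5, 12), dtype=bool)
region[:, 2:5] = True
region[:, 7:10] = True
region[0, 11] = True
stroke_width = estimate_stroke_width(region)
```

An estimate of this kind could plausibly be used to scale a local window to the stroke width, which is one way to read the abstract's statement that the local area size follows the local run-length histogram.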
