Morphological preprocessing method to thresholding degraded word images

This paper presents a novel preprocessing method, based on mathematical morphology, that improves the quality of subsequent thresholding of raw degraded word images. Such images contain undesirable background shapes, called critical shadows, that produce noise in the binary images, and this noise hinders the later segmentation of characters. Applying a thresholding method directly therefore yields inadequate binary versions of these degraded word images. Our preprocessing method, called Shadow Location and Lightening (SL*L), uses morphological operations to locate the critical shadows in grayscale degraded images adaptively, accurately, and without manual fine-tuning of parameters, and lightens them before thresholding is applied. In this way, enhanced binary images free of unpredictable and inappropriate noise can be passed to character segmentation. Adequate binary characters can then be segmented and extracted as input to optical character recognition (OCR) applications, saving computational effort and increasing the recognition rate. The proposed method is tested experimentally on a set of raw degraded images extracted from real photos acquired by unsophisticated imaging systems. A qualitative analysis of the experimental results indicates that the proposed preprocessing significantly improves thresholding quality. In addition, a quantitative evaluation on a test set of 1194 degraded word images shows that the proposed preprocessing is both necessary and effective for increasing the segmentation and recognition rates of their characters. A further advantage of the proposed method is that Otsu's method, a simple and easily implemented global thresholding technique, can be sufficient after preprocessing, which reduces the computational load.
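The general idea of lightening shadows before a global threshold can be illustrated with a small sketch. This is not the SL*L algorithm, whose details the abstract does not give. It assumes a generic stand-in: a grayscale morphological closing with a flat square structuring element estimates the background (including shadows), the image is lightened by the gap between that estimate and white, and Otsu's method is then applied. All function names, the structuring-element radius, and the synthetic test image are hypothetical choices made for illustration.

```python
# Illustrative sketch only: generic closing-based shadow lightening + Otsu.
# This is NOT the paper's SL*L method; names and parameters are assumptions.

def _window_filter(img, radius, pick):
    """Apply pick (max or min) over a square window clipped at the borders."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            window = [img[j][i]
                      for j in range(max(0, y - radius), min(h, y + radius + 1))
                      for i in range(max(0, x - radius), min(w, x + radius + 1))]
            row.append(pick(window))
        out.append(row)
    return out

def closing(img, radius=1):
    """Grayscale closing: dilation (max) followed by erosion (min).
    Removes dark details narrower than the window, e.g. thin strokes."""
    return _window_filter(_window_filter(img, radius, max), radius, min)

def lighten(img, radius=1):
    """Raise each pixel by (255 - estimated background), clipped to 255."""
    bg = closing(img, radius)
    return [[min(255, v + 255 - b) for v, b in zip(r_img, r_bg)]
            for r_img, r_bg in zip(img, bg)]

def otsu_threshold(img):
    """Otsu's global threshold: maximize the between-class variance."""
    hist = [0] * 256
    for row in img:
        for v in row:
            hist[v] += 1
    total = sum(hist)
    sum_all = sum(i * c for i, c in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0, sum0 = 0, 0
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        m0, m1 = sum0 / w0, (sum_all - sum0) / w1
        var = w0 * w1 * (m0 - m1) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t

def binarize(img):
    """Return the set of (row, col) pixels at or below the Otsu threshold."""
    t = otsu_threshold(img)
    return {(y, x) for y, row in enumerate(img)
            for x, v in enumerate(row) if v <= t}

def make_demo_image():
    """Synthetic 8x16 image: bright left half, shadowed right half,
    with one thin dark vertical stroke in each half."""
    img = [[200] * 8 + [100] * 8 for _ in range(8)]
    for y in range(2, 6):
        img[y][3] = 40    # stroke on the clean background
        img[y][12] = 20   # stroke inside the shadow
    return img

raw = make_demo_image()
strokes = {(y, x) for y in range(2, 6) for x in (3, 12)}
print(len(binarize(raw)), len(binarize(lighten(raw))), len(strokes))
```

On this synthetic image, thresholding the raw image marks the whole shadowed half as foreground, while thresholding the lightened image keeps only the two strokes. The radius must exceed the stroke half-width for the closing to erase the strokes from the background estimate; a real image would need a larger, tuned structuring element, which is the kind of manual tuning the proposed method is said to avoid.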
