Iterative model-based binarization algorithm for cheque images

Abstract. Binarizing document images that have poor contrast, strong noise, complex background patterns, and gray-scale histograms with a varying number of modes is a challenging problem. A new binarization algorithm has been developed to address this problem for personal cheque images. Its main contribution is to optimize the binarization of a part of the document image that suffers from noise interference, referred to as the Target Sub-Image (TSI), using information that is easily extracted from another, noise-free part of the same image, referred to as the Model Sub-Image (MSI). Simple spatial features extracted from the MSI serve as a model of the handwriting strokes. This model captures the underlying characteristics of the writing strokes and is invariant to handwriting style and content. It is then used to guide the binarization of the TSI. A second contribution is a new technique for the structural analysis of document images, which we call “Wavelet Partial Reconstruction” (WPR). The algorithm was tested on 4,200 cheque images, and the results show a significant improvement in binarization quality compared with other well-established algorithms.
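The MSI-to-TSI idea can be illustrated with a minimal sketch. It is not the algorithm evaluated in the paper: the abstract does not specify the spatial features or the iteration scheme, so the sketch assumes a single hypothetical feature (mean horizontal run length of ink pixels), a plain scan over candidate thresholds, and Otsu's method for the noise-free MSI. WPR is not covered. All function names are illustrative, and images are plain lists of rows of gray levels in 0..255 to keep the example dependency-free.

```python
def otsu_threshold(gray):
    """Otsu's global threshold; pixels with level <= t are treated as ink."""
    hist = [0] * 256
    for row in gray:
        for v in row:
            hist[v] += 1
    total = sum(hist)
    total_sum = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels at or below t
    sum0 = 0  # sum of gray levels at or below t
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, variance undefined
        diff = sum0 / w0 - (total_sum - sum0) / w1
        between = w0 * w1 * diff * diff  # proportional to between-class variance
        if between > best_var:
            best_t, best_var = t, between
    return best_t


def mean_run_length(gray, t):
    """Assumed stroke feature: mean length of horizontal runs with level <= t."""
    runs = pixels = 0
    for row in gray:
        prev = False
        for v in row:
            ink = v <= t
            pixels += ink
            runs += ink and not prev  # count the start of each run
            prev = ink
    return pixels / runs if runs else 0.0


def binarize_tsi(tsi, msi, candidates=range(10, 250, 5)):
    """Pick the TSI threshold whose stroke feature best matches the MSI model."""
    model = mean_run_length(msi, otsu_threshold(msi))  # model from the clean part
    best_t = min(candidates,
                 key=lambda t: abs(mean_run_length(tsi, t) - model))
    mask = [[v <= best_t for v in row] for row in tsi]
    return mask, best_t
```

In this sketch the stroke model depends only on stroke geometry, not on what is written, which mirrors the invariance claimed for the MSI model. A threshold that lets background pattern leak into the TSI foreground changes the run-length statistic and is therefore penalized.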
