Efficient implementation of local adaptive thresholding techniques using integral images

Adaptive binarization is an important first step in many document analysis and OCR processes. This paper describes a fast adaptive binarization algorithm that yields the same quality of binarization as the Sauvola method [7] but runs in time close to that of global thresholding methods (like Otsu's method [1]), independent of the window size. The algorithm combines the statistical constraints of Sauvola's method with integral images [9]. Testing on the UW-1 dataset demonstrates a 20-fold speedup compared to the original Sauvola algorithm.
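The idea summarized above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation. It assumes the standard Sauvola threshold t = m * (1 + k * (s / R - 1)), where m and s are the local mean and standard deviation. The function name `sauvola_integral` and the defaults `window=15`, `k=0.2`, and `R=128` are choices made for this example and are not taken from the abstract. Two integral images (one of intensities, one of squared intensities) give each window sum from four table lookups, so the per-pixel cost does not grow with the window size.

```python
import numpy as np


def sauvola_integral(gray, window=15, k=0.2, R=128.0):
    """Sauvola-style binarization; True marks dark (foreground) pixels.

    Hypothetical helper: window, k and R are illustrative defaults.
    """
    img = np.asarray(gray, dtype=np.float64)
    h, w = img.shape
    r = window // 2

    # Integral images of intensities and squared intensities, padded with a
    # leading zero row and column so window sums need no special cases.
    ii = np.zeros((h + 1, w + 1))
    ii2 = np.zeros((h + 1, w + 1))
    ii[1:, 1:] = img.cumsum(axis=0).cumsum(axis=1)
    ii2[1:, 1:] = (img ** 2).cumsum(axis=0).cumsum(axis=1)

    # Window bounds for every row and column, clipped at the image border.
    y0 = np.clip(np.arange(h) - r, 0, h)[:, None]
    y1 = np.clip(np.arange(h) + r + 1, 0, h)[:, None]
    x0 = np.clip(np.arange(w) - r, 0, w)[None, :]
    x1 = np.clip(np.arange(w) + r + 1, 0, w)[None, :]
    area = (y1 - y0) * (x1 - x0)

    def window_sum(table):
        # Four lookups per pixel, whatever the window size.
        return table[y1, x1] - table[y0, x1] - table[y1, x0] + table[y0, x0]

    mean = window_sum(ii) / area
    # Clamp tiny negative values caused by floating-point rounding.
    var = np.maximum(window_sum(ii2) / area - mean ** 2, 0.0)
    std = np.sqrt(var)

    threshold = mean * (1.0 + k * (std / R - 1.0))
    return img <= threshold


if __name__ == "__main__":
    page = np.full((40, 40), 200, dtype=np.uint8)  # light background
    page[18:22, 5:35] = 20                         # one dark stroke
    mask = sauvola_integral(page)
    print(int(mask.sum()), "foreground pixels")
```

Windows are truncated at the image border here, which is one of several reasonable border policies. For very large images, the squared-intensity table may call for a wider accumulator type than this sketch uses.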

[1]  N. Otsu. A threshold selection method from gray level histograms, 1979.

[2]  J. M. White, et al. Image Thresholding for Optical Character Recognition and Other Applications Requiring Character Image Extraction, 1983, IBM J. Res. Dev.

[3]  Franklin C. Crow, et al. Summed-area tables for texture mapping, 1984, SIGGRAPH.

[4]  André Marion, et al. Introduction to Image Processing, 1990, Springer US.

[5]  Lawrence O'Gorman. Binarization and Multithresholding of Document Images Using Connectivity, 1994, CVGIP Graph. Model. Image Process.

[6]  Øivind Due Trier, et al. Evaluation of Binarization Methods for Document Images, 1995, IEEE Trans. Pattern Anal. Mach. Intell.

[7]  Matti Pietikäinen, et al. Adaptive document image binarization, 2000, Pattern Recognit.

[8]  Horst Bunke, et al. Text extraction from colored book and journal covers, 2000, International Journal on Document Analysis and Recognition.

[9]  Paul A. Viola, et al. Robust Real-Time Face Detection, 2001, Proceedings Eighth IEEE International Conference on Computer Vision. ICCV 2001.

[10]  Hsi-Jian Lee, et al. Binarization of color document images via luminance and saturation color features, 2002, IEEE Trans. Image Process.

[11]  Bülent Sankur, et al. Survey over image thresholding techniques and quantitative performance evaluation, 2004, J. Electronic Imaging.

[12]  Nikos Papamarkos, et al. Automatic Evaluation of Document Binarization Results, 2005, CIARP.

[13]  Fatih Murat Porikli, et al. Fast Construction of Covariance Matrices for Arbitrary Size Image Windows, 2006, 2006 International Conference on Image Processing.

[14]  Nikos A. Nikolaou, et al. Text binarization in color documents, 2006, Int. J. Imaging Syst. Technol.

[15]  Thomas M. Breuel, et al. Performance Comparison of Six Algorithms for Page Segmentation, 2006, Document Analysis Systems.

[16]  Derek Bradley, et al. Adaptive Thresholding using the Integral Image, 2007, J. Graph. Tools.

[17]  Thomas M. Breuel, et al. Page Frame Detection for Marginal Noise Removal from Scanned Documents, 2007, SCIA.

[18]  Stefano Messelodi, et al. Geometric Layout Analysis Techniques for Document Image Understanding: a Review, 2008.