AdOtsu: An adaptive and parameterless generalization of Otsu's method for document image binarization

Adaptive binarization methods play a central role in document image processing. This work presents an adaptive, parameterless generalization of Otsu's method. Adaptiveness is obtained by combining grid-based modeling with an estimated background map. Parameterless behavior is achieved by automatically estimating document parameters, such as the average stroke width and the average line height. The method is further extended with a multiscale framework and has been applied to several datasets, including the DIBCO'09 dataset, with promising results.
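For orientation, the following is a minimal sketch of the classical global Otsu threshold and of a naive grid-local variant that applies it per cell. It is not the AdOtsu method: the estimated background map, the stroke-width and line-height estimation, and the multiscale extension are all omitted. The function names, the `cell` parameter, and the convention that pixels at or below the threshold are text are assumptions made for illustration only.

```python
from typing import List, Sequence


def otsu_threshold(pixels: Sequence[int], levels: int = 256) -> int:
    """Classical Otsu: pick t maximizing the between-class variance.

    Pixels with value <= t form the dark class (assumed here to be text).
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels in the dark class
    sum0 = 0  # sum of gray levels in the dark class
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue
        w1 = total - w0
        if w1 == 0:
            break
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        # Proportional to the between-class variance (constant factor dropped).
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def grid_otsu_binarize(image: List[List[int]], cell: int) -> List[List[int]]:
    """Hypothetical grid-local variant: one Otsu threshold per cell x cell block.

    Returns 1 for text (dark) pixels and 0 for background.
    """
    h, w = len(image), len(image[0])
    out = [[0] * w for _ in range(h)]
    for r0 in range(0, h, cell):
        for c0 in range(0, w, cell):
            rows = range(r0, min(r0 + cell, h))
            cols = range(c0, min(c0 + cell, w))
            t = otsu_threshold([image[r][c] for r in rows for c in cols])
            for r in rows:
                for c in cols:
                    out[r][c] = 1 if image[r][c] <= t else 0
    return out
```

In this sketch a cell containing a single gray level yields a threshold of 0, so a uniform cell is labeled background unless its pixels are exactly 0. Cells that contain only noisy background are still split into two classes, which is one reason a purely grid-local threshold is not sufficient on its own.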
