Unsupervised Evaluation Methods Based on Local Gray-Intensity Variances for Binarization of Historical Documents

We evaluate the efficacy of six unsupervised evaluation methods for tuning Sauvola's threshold in optical character recognition (OCR) applications. We propose local implementations of well-known measures based on gray-intensity variances. Additionally, we derive four new measures from them using the unbiased variance estimator and gray-intensity logarithms. In our experiment, we selected the well-binarized images according to each measure and computed the accuracy of the text recognized from each image. The results show that the weighted and uniform variances (using logarithms) are suitable measures for OCR applications.
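To make the idea of tuning Sauvola's threshold with a variance-based measure concrete, here is a minimal sketch in Python with NumPy. It is not the authors' implementation. The threshold formula T = m * (1 + k * (s / R - 1)) is the standard Sauvola rule. Everything else is an assumption made for illustration: the function names, the window size, the candidate values of k, the use of log(1 + g) as the logarithm, the rule that a lower score is better, and the fact that the measure is computed over the whole image rather than locally as the abstract proposes.

```python
import numpy as np


def sauvola_threshold(gray, window=15, k=0.2, R=128.0):
    """Per-pixel Sauvola threshold T = m * (1 + k * (s / R - 1)).

    m and s are the mean and standard deviation of the gray intensities
    in a window x window neighbourhood (window is assumed to be odd).
    """
    gray = np.asarray(gray, dtype=np.float64)
    pad = window // 2
    padded = np.pad(gray, pad, mode="reflect")
    # View of shape (H, W, window, window): one neighbourhood per pixel.
    windows = np.lib.stride_tricks.sliding_window_view(padded, (window, window))
    mean = windows.mean(axis=(2, 3))
    std = windows.std(axis=(2, 3))
    return mean * (1.0 + k * (std / R - 1.0))


def weighted_variance(gray, foreground, use_log=False):
    """Size-weighted within-class variance of foreground and background.

    Uses the unbiased variance estimator (ddof=1). With use_log=True the
    intensities are replaced by log(1 + g); the +1 is an assumption of
    this sketch to avoid log(0).
    """
    values = np.asarray(gray, dtype=np.float64)
    if use_log:
        values = np.log1p(values)
    fg = values[foreground]
    bg = values[~foreground]
    if fg.size < 2 or bg.size < 2:
        return float("inf")  # the unbiased estimator is undefined here
    total = fg.size * fg.var(ddof=1) + bg.size * bg.var(ddof=1)
    return float(total / values.size)


def select_k(gray, candidates=(0.1, 0.2, 0.3, 0.4, 0.5), window=15, use_log=True):
    """Return the candidate k with the lowest score, plus all scores."""
    gray = np.asarray(gray, dtype=np.float64)
    scores = {}
    for k in candidates:
        # Dark pixels (at or below the threshold) are treated as ink.
        foreground = gray <= sauvola_threshold(gray, window=window, k=k)
        scores[k] = weighted_variance(gray, foreground, use_log=use_log)
    return min(scores, key=scores.get), scores
```

Calling `select_k(image)` on a grayscale array returns the chosen k and the score of every candidate. A local variant would compute the same statistic inside neighbourhoods of each pixel instead of over the whole image.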
