Robust Binarization of Stereo and Monocular Document Images Using Percentile Filter

Camera-captured documents can be a difficult case for standard binarization algorithms. These algorithms are tailored to the requirements of scanned documents, which generally have uniform illumination and high resolution with negligible geometric artifacts. In contrast, camera-captured images generally have low resolution and non-uniform illumination, and they also possess geometric artifacts. The most important artifact is defocused or blurred text, which results from the limited depth of field of general-purpose hand-held capturing devices. These artifacts can be reduced by controlled capture with a single camera, but they are unavoidable in stereo document images, even with an orthoparallel camera setup. Existing binarization methods require separate parameter tuning for the left and right images of a stereo pair. In this paper, a binarization approach based on local adaptive background estimation using a percentile filter is presented. The presented approach works reasonably well with the same set of parameters for both the left and right images. It also shows competitive results for monocular images in comparison with standard binarization methods.
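
The idea of estimating the background with a percentile filter can be illustrated with a minimal sketch. The abstract does not give the algorithm's details, so everything below is an assumption made for illustration: dark text on a brighter page, a square window, nearest-rank percentile selection, and a fixed ratio test against the estimated background. The names `percentile_filter` and `binarize` and the parameters `radius`, `percentile`, and `threshold` are hypothetical and are not taken from the paper.

```python
def percentile_filter(image, radius, percentile):
    """Replace each pixel by a percentile of its square neighbourhood.

    The window is (2 * radius + 1) pixels wide and is clipped at the borders.
    """
    height, width = len(image), len(image[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            window = sorted(
                image[j][i]
                for j in range(max(0, y - radius), min(height, y + radius + 1))
                for i in range(max(0, x - radius), min(width, x + radius + 1))
            )
            # Nearest-rank index into the sorted window values.
            rank = round(percentile / 100 * (len(window) - 1))
            out[y][x] = window[rank]
    return out


def binarize(image, radius=2, percentile=80, threshold=0.75):
    """Label a pixel as text (1) if it is clearly darker than the local background.

    Assumption: the background is the high-percentile gray level around the
    pixel, and text is anything below `threshold` times that estimate.
    """
    background = percentile_filter(image, radius, percentile)
    return [
        [1 if pixel < threshold * bg else 0 for pixel, bg in zip(row, bg_row)]
        for row, bg_row in zip(image, background)
    ]


# Toy page: illumination rises from left to right, with a dark vertical stroke.
page = [[100 + 10 * x for x in range(7)] for _ in range(5)]
for y in (1, 2, 3):
    page[y][3] = 40
mask = binarize(page)
```

Because the background estimate follows the local illumination, the same `percentile` and `threshold` can in principle be reused across differently lit images, which is the property the abstract highlights for stereo pairs. The brute-force window sort here is only for readability; it is far slower than dedicated percentile filtering schemes.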
