Decompose-threshold approach to handwriting extraction in degraded historical document images

Historical documents contain important and interesting information. A number of techniques have previously been proposed for thresholding document images. In this paper, a new thresholding structure called the decompose-threshold approach is proposed and compared against several existing global and local algorithms. The proposed approach is a local adaptive analysis method that uses local feature vectors to select the most suitable thresholding method for each local area. Appropriate algorithms are selected or combined automatically for the specific type of document image under investigation. The original image is recursively broken down into sub-regions using a quad-tree until an appropriate thresholding method can be applied to each sub-region. The algorithm has been evaluated on 10 historical images obtained from the Library of Congress. Evaluation using the 'recall' measure demonstrates that the approach outperforms any single existing method.
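The recursive quad-tree decomposition described above can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not specify the local feature vectors or the rule for choosing a thresholding method, so the sketch substitutes a hypothetical placeholder rule: a near-uniform region is treated as background, a region that has reached a minimum size is thresholded with Otsu's method, and any other region is split into four quadrants. The names `decompose_threshold`, `otsu_threshold`, `recall`, `min_size`, and `flat_std` are assumptions introduced for illustration, and ink is assumed to be darker than the background in an 8-bit grayscale image.

```python
import numpy as np


def otsu_threshold(region):
    """Otsu's threshold for an 8-bit grayscale array (pixels <= t form the dark class)."""
    hist = np.bincount(region.ravel(), minlength=256).astype(float)
    total = hist.sum()
    levels = np.arange(256)
    w0 = np.cumsum(hist)            # pixel count at or below each level
    w1 = total - w0                 # pixel count above each level
    m0 = np.cumsum(hist * levels)   # cumulative intensity sum
    m_total = m0[-1]
    valid = (w0 > 0) & (w1 > 0)
    between = np.zeros(256)
    # Between-class variance, written with unnormalised counts.
    between[valid] = (m_total * w0[valid] - m0[valid] * total) ** 2 / (
        total ** 2 * w0[valid] * w1[valid]
    )
    return int(np.argmax(between))


def decompose_threshold(image, min_size=16, flat_std=5.0):
    """Quad-tree decomposition with a placeholder per-region rule (assumption).

    Returns a boolean mask that is True where a pixel is classified as ink.
    """
    min_size = max(int(min_size), 1)
    ink = np.zeros(image.shape, dtype=bool)

    def visit(r0, r1, c0, c1):
        region = image[r0:r1, c0:c1]
        h, w = region.shape
        if region.std() <= flat_std:
            return  # hypothetical rule: near-uniform region, no ink
        if h <= min_size or w <= min_size:
            # Hypothetical rule: small enough for a single local threshold.
            ink[r0:r1, c0:c1] = region <= otsu_threshold(region)
            return
        rm, cm = (r0 + r1) // 2, (c0 + c1) // 2
        visit(r0, rm, c0, cm)   # top-left quadrant
        visit(r0, rm, cm, c1)   # top-right quadrant
        visit(rm, r1, c0, cm)   # bottom-left quadrant
        visit(rm, r1, cm, c1)   # bottom-right quadrant

    visit(0, image.shape[0], 0, image.shape[1])
    return ink


def recall(predicted, truth):
    """Fraction of ground-truth ink pixels that were also predicted as ink."""
    n_truth = truth.sum()
    if n_truth == 0:
        return 1.0
    return float(np.logical_and(predicted, truth).sum() / n_truth)
```

In this sketch the flat-region test and the minimum-size test stand in for the feature-based selection step; a fuller version would replace them with a rule that inspects local features and chooses among several global and local thresholding algorithms.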
