Image Enhancement of Historical Documents Using Directional Wavelet

This paper proposes a novel algorithm to clean up a large collection of historical handwritten documents kept in the National Archives of Singapore. Because ink has seeped through the paper over long periods of storage, the front of each page is severely marred by writing from the reverse side. Earlier attempts matched both sides of a page to identify the offending strokes originating from the back and then eliminated them with the aid of a wavelet transform. Perfect matching, however, is difficult because of document skew, differing resolutions, warped pages, and reverse sides that were inadvertently left out during image capture. A new approach is now proposed that does away with double-sided mapping by using a directional wavelet transform, which distinguishes foreground strokes from reverse-side strokes much better than the conventional wavelet transform. Experiments show that the directional wavelet operation significantly enhances the readability of each document without any need to map the page against its reverse side.
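The abstract does not specify the directional filters, so the following is only a minimal sketch of the general idea of orientation-selective detail coefficients, not the authors' transform. It assumes, purely for illustration, that foreground strokes and mirrored reverse-side strokes slant along opposite diagonals; the function names `diagonal_haar_details` and `dominant_slant` are hypothetical.

```python
import numpy as np


def diagonal_haar_details(img):
    """One-level, Haar-like detail magnitudes taken along the two diagonals.

    Assumption (not from the paper): a plain two-tap difference stands in
    for the directional wavelet filter.
    """
    img = np.asarray(img, dtype=float)
    # Difference taken along the main-diagonal direction (down-right step).
    # It vanishes on strokes running along "\" and responds to "/" strokes.
    d_main = np.abs(img[:-1, :-1] - img[1:, 1:]) / 2.0
    # Difference taken along the anti-diagonal direction (down-left step).
    # It vanishes on strokes running along "/" and responds to "\" strokes.
    d_anti = np.abs(img[:-1, 1:] - img[1:, :-1]) / 2.0
    return d_main, d_anti


def dominant_slant(img):
    """Label a patch by the diagonal along which its strokes mostly run."""
    d_main, d_anti = diagonal_haar_details(img)
    # A stroke is smooth along its own direction, so the smaller detail
    # energy marks the stroke orientation.
    return "\\" if d_main.sum() < d_anti.sum() else "/"


# Synthetic 8x8 patches: one stroke per patch, mirrored left to right.
front_patch = np.eye(8)
back_patch = np.fliplr(front_patch)
print(dominant_slant(front_patch), dominant_slant(back_patch))
```

In a fuller pipeline, coefficients from the orientation attributed to reverse-side strokes would be attenuated before reconstruction; that step is omitted here because the abstract gives no details about it.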
