Optimal selection of binarization techniques for the processing of ancient palm leaf manuscripts

Ancient palm leaf manuscripts in Thailand have been preserved by many organizations to protect and retrieve traditional knowledge. With advances in computer technology, these documents are now commonly recorded as digitized media. One objective of such work is to develop an efficient image processing system that can retrieve knowledge and information from the manuscripts automatically. Binarization is an important preprocessing stage that separates text and characters from the background, and its output feeds later processes such as character recognition and knowledge extraction. However, no single binarization technique is suitable for all documents. This study aims to improve the binarization process by applying machine learning techniques to select the optimal binarization technique for each palm leaf manuscript. Experimental results are reported, and the approach could be applied to the automatic selection of binarization techniques in other image processing problems.
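To make the binarization step concrete, the following is a minimal sketch of one widely used global technique, Otsu's threshold, written in plain Python. The abstract does not state which techniques the study compares or how they are implemented, so the function names `otsu_threshold` and `binarize`, the 8-bit gray-level range, and the assumption of dark ink on a lighter leaf are all illustrative choices rather than the authors' method.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level t that maximizes between-class variance.

    `pixels` is a flat list of integer gray values in [0, levels).
    Illustrative sketch only; not taken from the paper.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0      # number of pixels in the dark class (gray level <= t)
    sum0 = 0    # sum of gray values in the dark class
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue            # dark class still empty
        w1 = total - w0
        if w1 == 0:
            break               # bright class empty from here on
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        between_var = w0 * w1 * (mean0 - mean1) ** 2
        if between_var > best_var:
            best_t, best_var = t, between_var
    return best_t


def binarize(image, t):
    """Map each pixel to 0 (assumed ink, value <= t) or 1 (assumed background)."""
    return [[1 if p > t else 0 for p in row] for row in image]


# Tiny hypothetical image: bright leaf surface with a few dark ink pixels.
image = [[200, 210, 40],
         [205, 35, 45]]
t = otsu_threshold([p for row in image for p in row])
binary = binarize(image, t)
```

A single global threshold of this kind tends to fail on unevenly stained or faded leaves, which is one reason a selection step that chooses among several binarization techniques per document can be useful.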
