A Thresholding Algorithm for Text/Background Segmentation in Degraded Handwritten Jawi Documents

Old documents written in Jawi script are widely used as references, but the hard copies deteriorate over time. Most previous work on Jawi documents has focused on character recognition, where the accuracy of the algorithms is strongly affected by noise. An effective preprocessing method is therefore required to binarize degraded Jawi documents. This paper proposes a new technique for thresholding degraded Jawi documents. The new algorithm was evaluated and compared with several existing algorithms, and the quality of the thresholding methods was assessed using both visual inspection and mathematical evaluation. The results show that the new technique outperformed the other binarization algorithms in the comparison.
