Thresholding and Character Recognition in Security Documents with Watermarked Background

Converting scanned gray-level images into binary form while retaining the foreground and removing the background is an important step in document image analysis. One application is the processing of security documents, such as identification cards, passports and residency cards, which contain watermarks and special hidden features. Recognizing characters in such images requires bi-level thresholding that copes with non-uniform illumination, complex background patterns and non-uniformly distributed backgrounds. A local thresholding technique based on a multi-layer perceptron (MLP) neural network (NN), previously developed by the author and others, was modified and used to remove the background and watermarks found in security documents. The thresholded images are then passed to an OCR system that recognizes the text in the document image. Integrating the modified NN thresholding technique with the proposed OCR system yields a 98.3% character recognition rate, which is compared against the existing thresholding techniques used in commercial OCR software such as ABBYY FineReader.
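To make the idea of local (as opposed to global) thresholding concrete, the following is a minimal sketch in Python with NumPy. It is not the MLP-based method described above: it swaps in a plain local-mean rule as a stand-in, and the function name `local_mean_threshold` and the `window` and `offset` parameters are assumptions chosen for illustration only. A pixel is marked as foreground when it is darker than the mean of its neighbourhood by more than `offset` gray levels, which lets the decision follow slow changes in illumination or background.

```python
import numpy as np


def local_mean_threshold(gray, window=15, offset=10):
    """Binarize a gray-level image with a local-mean rule (illustrative stand-in).

    Returns a boolean mask where True marks foreground (dark text) pixels,
    i.e. pixels darker than their local mean by more than `offset`.
    """
    if window % 2 == 0:
        raise ValueError("window must be odd so it can be centred on a pixel")

    gray = np.asarray(gray, dtype=np.float64)
    height, width = gray.shape
    pad = window // 2

    # Replicate border pixels so every pixel has a full neighbourhood.
    padded = np.pad(gray, pad, mode="edge")

    # Integral image with a leading row and column of zeros, so that any
    # window sum can be read off with four lookups.
    integral = np.zeros((padded.shape[0] + 1, padded.shape[1] + 1))
    integral[1:, 1:] = padded.cumsum(axis=0).cumsum(axis=1)

    window_sum = (
        integral[window:window + height, window:window + width]
        - integral[:height, window:window + width]
        - integral[window:window + height, :width]
        + integral[:height, :width]
    )
    local_mean = window_sum / (window * window)

    return gray < local_mean - offset
```

On a synthetic image with a left-to-right brightness gradient and a thin dark stroke, this rule should pick out the stroke while leaving the graded background empty, which a single global threshold cannot always guarantee. A learned local method such as the one in the abstract would replace the fixed mean-minus-offset rule with a trained classifier over local features.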
