Stroke-model-based character extraction from gray-level document images

Global gray-level thresholding techniques such as Otsu's method, and local techniques such as edge-based segmentation or adaptive thresholding, are effective at extracting character objects from simple or slowly varying backgrounds. However, they prove insufficient when the background contains sharply varying contours or when the fonts vary in size. A stroke model is proposed that depicts the local features of character objects as double edges of a predefined size. This model makes it possible to detect thin connected components selectively while ignoring relatively large background regions, even complex ones. Because the stroke-width restriction is built into the model, the proposed technique can also be used to extract characters of predefined font sizes. To process large volumes of documents efficiently, a hybrid method is proposed for character extraction from various backgrounds. It uses a measure of class separability to differentiate images with simple backgrounds from those with complex backgrounds, and applies the appropriate method to each. Experiments on extracting handwriting from a check image, as well as machine-printed characters from scene images, demonstrate the effectiveness of the proposed model.
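As a rough illustration of the two ideas above, the following sketch works on a single image row of 8-bit gray levels. It is not the authors' implementation: the function names, the exact form of the double-edge response, the dark-on-light assumption, and the `eta_cutoff` value are all assumptions made for the example. The separability measure used here is the ratio of between-class variance to total variance at the Otsu threshold, which is one common choice.

```python
def otsu_separability(pixels, levels=256):
    """Return (threshold, eta); eta = between-class variance / total variance."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    n = len(pixels)
    total_sum = sum(i * h for i, h in enumerate(hist))
    mean_total = total_sum / n
    var_total = sum(h * (i - mean_total) ** 2 for i, h in enumerate(hist)) / n
    if var_total == 0:
        return 0, 0.0  # flat image: no meaningful split
    best_t, best_between = 0, 0.0
    count0, sum0 = 0, 0  # running count and gray-level sum of pixels <= t
    for t in range(levels - 1):
        count0 += hist[t]
        sum0 += t * hist[t]
        count1 = n - count0
        if count0 == 0 or count1 == 0:
            continue
        mean0 = sum0 / count0
        mean1 = (total_sum - sum0) / count1
        between = (count0 / n) * (count1 / n) * (mean0 - mean1) ** 2
        if between > best_between:
            best_t, best_between = t, between
    return best_t, best_between / var_total


def double_edge_response(row, stroke_width):
    """Assumed double-edge response for dark strokes narrower than stroke_width.

    For each pixel, look at every pair of pixels stroke_width apart that
    straddles it. If both ends of some pair are much brighter than the pixel,
    the pixel lies on a thin dark stroke. Inside a dark region wider than
    stroke_width no such pair exists, so the response stays at zero.
    """
    n = len(row)
    response = [0] * n
    for x in range(n):
        best = None
        for i in range(1, stroke_width):
            left, right = x - i, x + stroke_width - i
            if left < 0 or right >= n:
                continue  # skip pairs that fall outside the row
            pair_min = min(row[left], row[right])
            if best is None or pair_min > best:
                best = pair_min
        if best is not None:
            response[x] = max(0, best - row[x])
    return response


def extract_dark_pixels(row, stroke_width, contrast, eta_cutoff=0.9):
    """Hybrid dispatch: global threshold if classes separate well, else stroke model.

    eta_cutoff is a hypothetical value chosen for this example only.
    """
    threshold, eta = otsu_separability(row)
    if eta >= eta_cutoff:
        return [x for x, v in enumerate(row) if v <= threshold]
    response = double_edge_response(row, stroke_width)
    return [x for x, r in enumerate(response) if r > contrast]
```

For a row containing a 2-pixel dark stroke and an 8-pixel dark block on a bright background, `double_edge_response(row, 4)` is nonzero only on the thin stroke, whereas a global threshold keeps both dark regions. This is the selectivity the stroke-width restriction is meant to provide.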
