Extraction and restoration of digits touching or overlapping lines

Characters written on form documents often touch or overlap the surrounding lines. On slips, a special kind of table-form document, digits are written in designated regions and frequently come into contact with the lines bounding those regions. This contact interferes with digit extraction and therefore degrades system performance. This paper proposes a new method that correctly extracts and properly restores digits touching or overlapping lines. The method extracts pre-digits from the input data and divides each of them into a digit-component and a line-component. It then decomposes each line-component into type A restoration parts, type B restoration parts, non-restoration parts, and candidate generation parts. Finally, it restores each kind of part with a different method.
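
To make the division of a pre-digit into a digit-component and a line-component concrete, the following is a minimal sketch. It is not the method of the paper: the abstract does not say how the split is made, so this sketch assumes a simple horizontal run-length rule. The function name `split_line_component`, the `min_run` threshold, and the toy image are all hypothetical.

```python
def split_line_component(image, min_run=5):
    """Split a binary image (list of rows of 0/1) into (digit, line) masks.

    Assumption for illustration only: a foreground pixel belongs to the
    line-component if it lies in a horizontal run of at least `min_run`
    foreground pixels; every other foreground pixel is digit-component.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    digit = [[0] * width for _ in range(height)]
    line = [[0] * width for _ in range(height)]
    for y, row in enumerate(image):
        x = 0
        while x < width:
            if row[x] == 0:
                x += 1
                continue
            # Measure the horizontal run of foreground pixels starting at x.
            start = x
            while x < width and row[x] == 1:
                x += 1
            target = line if x - start >= min_run else digit
            for i in range(start, x):
                target[y][i] = 1
    return digit, line


# Toy pre-digit: a vertical stroke (column 2) crossing a horizontal line (row 2).
pre_digit = [
    [0, 0, 1, 0, 0, 0],
    [0, 0, 1, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0, 0],
]
digit_part, line_part = split_line_component(pre_digit, min_run=5)
```

In this toy case the pixel where the stroke crosses the line is assigned to the line-component, so the digit-component is left with a gap at row 2. Gaps of this kind are what a restoration step has to fill after the line is removed.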
