A model-based form processing sub-system

This paper presents a model-based form processing sub-system consisting of a form model database and five modules: (i) form modeling, (ii) form recognition, (iii) form dropout, (iv) a form definition tool, and (v) form reconstruction. The form modeling module builds explicit representations of scanned form templates to facilitate form recognition and dropout. It can also assist a user in defining the fields on a form. Automatic form recognition eliminates the need to sort input forms manually. The form dropout module removes pre-printed form content, which yields a high data compression rate and provides clean data for OCR. Our model-driven form dropout scheme has two major advantages over image-based subtraction methods: higher dropout efficiency and better preservation of the quality of filled-in data.
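To make the idea of model-driven dropout concrete, the following is a minimal sketch and not the method of this paper. It assumes a binary image (1 = ink, 0 = background) that is already aligned with its form model, and it assumes the model stores pre-printed horizontal lines as simple bands. The `HLine` class, the `drop_out` function, and the crossing heuristic (keep a line pixel when there is ink directly above and below the band) are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List

Image = List[List[int]]  # 1 = ink, 0 = background


@dataclass(frozen=True)
class HLine:
    """Hypothetical form-model entry: one pre-printed horizontal line."""
    top: int     # first row of the line band (inclusive)
    bottom: int  # last row of the line band (inclusive)
    left: int    # first column (inclusive)
    right: int   # last column (inclusive)


def drop_out(image: Image, model: List[HLine]) -> Image:
    """Erase modeled lines, keeping pixels where a filled-in stroke crosses.

    Assumption: a column is treated as a crossing stroke when the pixels
    directly above and directly below the line band are both ink.
    """
    out = [row[:] for row in image]  # work on a copy
    height = len(image)
    for line in model:
        for col in range(line.left, line.right + 1):
            above = line.top > 0 and image[line.top - 1][col] == 1
            below = line.bottom + 1 < height and image[line.bottom + 1][col] == 1
            if above and below:
                continue  # likely user ink crossing the line: preserve it
            for row in range(line.top, line.bottom + 1):
                out[row][col] = 0
    return out


# A vertical stroke (column 2) crossing a pre-printed line (row 2).
filled = [
    [0, 0, 1, 0, 0, 0],
    [0, 0, 1, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
cleaned = drop_out(filled, [HLine(top=2, bottom=2, left=0, right=5)])
```

Because only the regions named in the model are visited, the work scales with the size of the model rather than with the whole page, and the template image itself is never subtracted from the filled-in form. A pixel-wise subtraction of the template would instead also erase the stroke pixel where it overlaps the line.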
