A syntactic business form classifier

This paper proposes a classifier that extracts structural information from business forms. The classifier builds on existing techniques and exploits the highly structured nature of forms, which consist of lines, boxes and text. Improvements widen its scope so that it handles some unexpected cases found in real images. A syntactic representation is built from the detected features using their positions and lengths; the information recorded in this representation is independent of scale and displacement. A filled-in form can then be compared with prerecorded blank forms to determine which one fits best. Encouraging experimental results have been obtained.
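The abstract does not spell out the representation or the matching procedure, so the following is only a minimal sketch of one way the idea could work, not the paper's method. It assumes that detected lines are axis-aligned, that positions and lengths are normalized by the bounding box of all lines (which removes scale and displacement), and that forms are compared with a plain edit distance over the resulting symbol sequences. The names `Line`, `encode`, `edit_distance` and `classify`, and the quantization into 8 bins, are hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Line:
    orientation: str  # "H" (horizontal) or "V" (vertical); assumed axis-aligned
    x: float          # start point, in pixels
    y: float
    length: float


def encode(lines, bins=8):
    """Turn detected lines into a sorted sequence of quantized symbols.

    Coordinates are shifted by the bounding-box origin and divided by its
    larger side, so the result does not depend on scale or displacement.
    """
    xs, ys = [], []
    for ln in lines:
        xs += [ln.x, ln.x + (ln.length if ln.orientation == "H" else 0.0)]
        ys += [ln.y, ln.y + (ln.length if ln.orientation == "V" else 0.0)]
    x0, y0 = min(xs), min(ys)
    scale = max(max(xs) - x0, max(ys) - y0) or 1.0

    def q(value):
        # Quantize a normalized value in [0, 1] into one of `bins` levels.
        return min(bins - 1, int(value * bins))

    symbols = [
        (ln.orientation, q((ln.x - x0) / scale), q((ln.y - y0) / scale),
         q(ln.length / scale))
        for ln in lines
    ]
    return sorted(symbols)


def edit_distance(a, b):
    """Levenshtein distance between two symbol sequences (unit costs)."""
    prev = list(range(len(b) + 1))
    for i, sa in enumerate(a, start=1):
        cur = [i]
        for j, sb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (sa != sb)))  # substitution
        prev = cur
    return prev[-1]


def classify(filled, templates):
    """Return the name of the blank template closest to the filled form."""
    code = encode(filled)
    return min(templates, key=lambda name: edit_distance(code, encode(templates[name])))
```

Here `templates` would be a dictionary mapping a form name to the lines detected on its blank version. A real system would also need to tolerate skew, broken lines and marks added while filling in the form, which this sketch ignores.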
