Image-based form document retrieval

We address the problem of image-based form document retrieval. The essential element of this problem is the definition of a similarity measure that remains applicable in real situations, where query images are allowed to differ from the database images. Based on the definition of a form signature, we propose a similarity measure that is insensitive to translation, scaling, moderate skew (< 5°), and variations in the geometrical proportions of the form layout. The measure also tolerates line detection errors well. We have developed a prototype form retrieval system and tested it on a database containing 100 different kinds of forms.
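
To make the idea of a translation- and scale-insensitive comparison concrete, the following is a minimal illustrative sketch, not the form signature or similarity measure defined in the paper. It assumes, hypothetically, that a form is summarized by the positions of its detected horizontal and vertical lines. The names `normalize`, `axis_similarity`, `form_similarity`, and the tolerance `tol` are all invented for this illustration, and skew is not handled.

```python
from typing import List, Sequence, Tuple

# Hypothetical signature: (y positions of horizontal lines, x positions of vertical lines)
Signature = Tuple[Sequence[float], Sequence[float]]


def normalize(positions: Sequence[float]) -> List[float]:
    """Map sorted line positions to [0, 1], removing translation and uniform scale."""
    pts = sorted(positions)
    if not pts:
        return []
    span = pts[-1] - pts[0]
    if span == 0:
        return [0.0 for _ in pts]
    return [(p - pts[0]) / span for p in pts]


def axis_similarity(a: Sequence[float], b: Sequence[float], tol: float = 0.03) -> float:
    """Dice-style score: share of lines that pair up within `tol` after normalization."""
    na, nb = normalize(a), normalize(b)
    if not na and not nb:
        return 1.0
    if not na or not nb:
        return 0.0
    i = j = matched = 0
    # Greedy two-pointer matching over the two sorted lists.
    while i < len(na) and j < len(nb):
        if abs(na[i] - nb[j]) <= tol:
            matched += 1
            i += 1
            j += 1
        elif na[i] < nb[j]:
            i += 1  # unmatched line in a (e.g. a spurious detection)
        else:
            j += 1  # unmatched line in b (e.g. a missed detection)
    return 2.0 * matched / (len(na) + len(nb))


def form_similarity(query: Signature, model: Signature, tol: float = 0.03) -> float:
    """Average the horizontal-line and vertical-line scores."""
    h = axis_similarity(query[0], model[0], tol)
    v = axis_similarity(query[1], model[1], tol)
    return (h + v) / 2.0


if __name__ == "__main__":
    model = ([10, 50, 90, 130], [20, 100, 180])
    # Same layout, shifted by 7 pixels and scaled by a factor of 2.
    query = ([2 * y + 7 for y in model[0]], [2 * x + 7 for x in model[1]])
    print(form_similarity(query, model))
```

Because each axis is normalized independently, this sketch ignores shifts and uniform rescaling, and the tolerance absorbs small positional differences. A missed or spurious interior line lowers the score only partially. A missed outermost line, however, changes the normalization itself, which is one reason a practical measure needs a more robust design than this.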
