Global-local-global method for logical structure extraction of form document image

Many methods for form document image analysis have been proposed, but only a few address the extraction of logical structure. This paper proposes a new method, called the global-local-global (GLG) method, for extracting the logical structure of a form document. Its algorithm consists of three phases: global division, local logical structure analysis, and global redivision of the whole document. The GLG method emphasizes global layout structure analysis and achieves high accuracy. It is robust to accidental direct adjacency between two unrelated cells. In addition, a logical structure tree is proposed to represent the logical structure of a form document.
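As an illustration of the idea of a logical structure tree, the following is a minimal sketch and not the representation defined in the paper. The `Node` class, its fields (`label`, `bbox`, `children`), and the example form with its field names are all assumptions made here for illustration: internal nodes stand for logical groups, and leaves stand for individual form cells with a bounding box.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Node:
    """One node of a hypothetical logical structure tree."""
    label: str
    # Bounding box (x0, y0, x1, y1) of a cell; None for purely logical groups.
    bbox: Optional[Tuple[int, int, int, int]] = None
    children: List["Node"] = field(default_factory=list)

    def add(self, child: "Node") -> "Node":
        """Attach a child node and return it so calls can be chained."""
        self.children.append(child)
        return child

    def leaves(self) -> List["Node"]:
        """Return the leaf nodes (the form cells) in left-to-right order."""
        if not self.children:
            return [self]
        result: List["Node"] = []
        for child in self.children:
            result.extend(child.leaves())
        return result

    def depth(self) -> int:
        """Number of levels in the subtree rooted at this node."""
        return 1 + max((c.depth() for c in self.children), default=0)


# Illustrative form: an "applicant" group with two cells, plus a "date" cell.
form = Node("form")
applicant = form.add(Node("applicant"))
applicant.add(Node("name", (0, 0, 100, 20)))
applicant.add(Node("address", (0, 20, 100, 40)))
form.add(Node("date", (100, 0, 200, 20)))

print([leaf.label for leaf in form.leaves()])
print(form.depth())
```

In this sketch, two cells that happen to touch in the layout, such as `address` and `date`, can still sit in different subtrees, which is the kind of separation a logical representation is meant to express.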
