Using citing information to understand the logical structure of document images

Abstract. The most distinctive characteristic of a construction tender document is that its hierarchical structure is not expressed explicitly but is implied by its citing information. Existing methods cannot handle such documents. In this paper, the intra-page and inter-page relationships are analyzed in detail. Establishing the citing relationships is essential to extracting the logical structure of tender documents. Because tender documents are hierarchical, their logical structure is naturally extracted and displayed as a tree. The method has been successfully implemented in VHTender, where it is key to the efficiency and flexibility of the whole system.
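To illustrate how citing relationships could yield a tree, here is a minimal Python sketch. It is not the VHTender implementation. It assumes, purely for illustration, that the citing information has already been recognized and reduced to (citing page, cited page) pairs, and that a citing page is treated as the parent of the page it cites. The function names `build_tree` and `render` and the page identifiers are hypothetical.

```python
from collections import defaultdict


def build_tree(citations, root):
    """Build a nested tree from (citing_page, cited_page) pairs.

    Assumption: a page that cites another page is its parent.
    """
    children = defaultdict(list)
    for citing, cited in citations:
        children[citing].append(cited)

    def expand(node, seen):
        # Guard against cyclic citations: do not expand a page that
        # already appears on the path from the root.
        if node in seen:
            return {"id": node, "children": []}
        seen = seen | {node}
        return {
            "id": node,
            "children": [expand(c, seen) for c in children.get(node, [])],
        }

    return expand(root, frozenset())


def render(tree, depth=0):
    """Return the tree as a list of indented lines, one per page."""
    lines = ["  " * depth + tree["id"]]
    for child in tree["children"]:
        lines.extend(render(child, depth + 1))
    return lines


# Hypothetical citing pairs for a small tender document.
citations = [
    ("cover", "section-1"),
    ("cover", "section-2"),
    ("section-1", "drawing-A"),
]
tree = build_tree(citations, "cover")
print("\n".join(render(tree)))
```

In this sketch the inter-page relationships are the citing pairs themselves. Recognizing the citing information on a page image is a separate step that is not shown here.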
