Interpreting and representing tabular documents

This paper describes a methodology for interpreting the information in telephone company DSX assignment table drawings. Horizontal lines are found using an efficient algorithm that operates on the run-length encoded representation of the image. To find vertical lines, the image is transposed using an efficient method we developed, and the horizontal-line algorithm is applied again. From the detected lines, the tabular structures are extracted by finding the biconnected components of the graph formed by the lines and their intersections. A methodology has also been developed for representing and accessing the entries inside the tables.
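
The table-extraction step can be illustrated with a minimal sketch. It is not the implementation described in the paper: the line format ((y, x1, x2) for horizontal lines, (x, y1, y2) for vertical lines), the function names build_graph and biconnected_components, and the sample coordinates are all assumptions made for illustration. The sketch builds a graph whose nodes are line intersections and whose edges join neighbouring intersections on the same line, then finds biconnected components with a standard depth-first search that keeps a stack of edges.

```python
from collections import defaultdict

def build_graph(h_lines, v_lines):
    """Nodes are intersection points; edges join neighbours along a line.

    Assumed (hypothetical) formats: h_lines hold (y, x1, x2),
    v_lines hold (x, y1, y2), both with inclusive end points.
    """
    on_line = defaultdict(list)  # line id -> intersection points on it
    for i, (y, x1, x2) in enumerate(h_lines):
        for j, (x, y1, y2) in enumerate(v_lines):
            if x1 <= x <= x2 and y1 <= y <= y2:
                on_line[("h", i)].append((x, y))
                on_line[("v", j)].append((x, y))
    graph = defaultdict(set)
    for points in on_line.values():
        points.sort()  # orders points along the line
        for a, b in zip(points, points[1:]):
            graph[a].add(b)
            graph[b].add(a)
    return graph

def biconnected_components(graph):
    """Return a list of node sets, one per biconnected component."""
    depth, low = {}, {}
    edge_stack, components = [], []

    def dfs(u, parent, d):
        depth[u] = low[u] = d
        for v in graph[u]:
            if v == parent:
                continue
            if v not in depth:               # tree edge
                edge_stack.append((u, v))
                dfs(v, u, d + 1)
                low[u] = min(low[u], low[v])
                if low[v] >= depth[u]:       # u separates v's subtree
                    comp = set()
                    while True:
                        a, b = edge_stack.pop()
                        comp.update((a, b))
                        if (a, b) == (u, v):
                            break
                    components.append(comp)
            elif depth[v] < depth[u]:        # back edge to an ancestor
                edge_stack.append((u, v))
                low[u] = min(low[u], depth[v])

    for node in list(graph):
        if node not in depth:
            dfs(node, None, 0)
    return components

# Made-up drawing: a 2x2-cell table and a one-cell table that are
# joined only by one long horizontal line along y = 0.
h_lines = [(0, 0, 11), (1, 0, 2), (2, 0, 2), (1, 10, 11)]
v_lines = [(0, 0, 2), (1, 0, 2), (2, 0, 2), (10, 0, 1), (11, 0, 1)]

components = biconnected_components(build_graph(h_lines, v_lines))
# A component with a closed cell needs at least four corner nodes;
# a lone connecting segment forms a two-node component and is dropped.
tables = [c for c in components if len(c) >= 4]
```

In this sample the long line that links the two tables becomes a separate two-node component, so filtering on component size leaves the two tables as distinct structures. The recursive search is adequate for small drawings; a large drawing would need an iterative version to stay within Python's recursion limit.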
