Structural recognition of tabulated data

A system is described for extracting the structural information of a table from its image. After initial binarization and deskewing, the image is scanned to extract all horizontal and vertical lines present. The table's dimensions are estimated from these lines. Unlike other systems, the procedure does not rely solely on ruled lines to delimit the item blocks: white streams in both the horizontal and vertical directions are recognized as substitutes for any missing demarcation lines. A structure interpretation procedure uses the extracted demarcation information to identify each item block in the table. The interrelations of these item blocks are then used to recognize the structure of the tabulated data.
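The white-stream idea can be illustrated with a minimal sketch. The abstract does not say how the streams are detected, so the projection-profile approach below is an assumption, as are the names `white_streams`, `find_separators`, and the `min_width` threshold. The input is assumed to be an already binarized and deskewed image, given as a list of rows in which 1 marks ink and 0 marks background.

```python
from typing import List, Tuple

Image = List[List[int]]  # assumed encoding: 1 = ink, 0 = background


def white_streams(profile: List[int], min_width: int) -> List[Tuple[int, int]]:
    """Return (start, end) pairs, end exclusive, for each run of zeros
    in the profile that is at least min_width entries long."""
    streams = []
    start = None
    for i, count in enumerate(profile):
        if count == 0:
            if start is None:
                start = i  # a new white run begins here
        else:
            if start is not None and i - start >= min_width:
                streams.append((start, i))
            start = None
    # Close a run that reaches the end of the profile.
    if start is not None and len(profile) - start >= min_width:
        streams.append((start, len(profile)))
    return streams


def find_separators(image: Image, min_width: int = 2):
    """Find horizontal and vertical white streams from ink counts
    per row and per column (hypothetical helper, not from the paper)."""
    row_profile = [sum(row) for row in image]
    col_profile = [sum(col) for col in zip(*image)]
    return white_streams(row_profile, min_width), white_streams(col_profile, min_width)


# Four ink blocks separated by a two-pixel white stream in each direction.
image = [
    [1, 1, 0, 0, 1, 1],
    [1, 1, 0, 0, 1, 1],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [1, 1, 0, 0, 1, 1],
]
horizontal, vertical = find_separators(image)
print(horizontal, vertical)  # [(2, 4)] [(2, 4)]
```

In this sketch a white stream must span the full width or height of the image. A practical system would likely restrict the profiles to the estimated table area and combine the streams with the extracted ruled lines before identifying item blocks.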
