Table structure recognition based on textblock arrangement and ruled line position

This paper describes a new method for recognizing table structures in document images. The cells of a table are arranged regularly in two dimensions, and each cell is identified by a row-column pair. These coordinates can be determined explicitly even in the absence of ruled lines. It is therefore assumed that the arrangement of textblocks defines the table structure, that is, the arrangement of rows and columns, and that ruled lines clarify the relationships among them. The recognition process consists of two procedures: the expansion of cell bounding boxes and the assignment of row-column numbers to each edge. Experimental results show that the method can be applied to partially ruled tables.
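
The two procedures can be illustrated with a minimal sketch. It is not the authors' algorithm, only one simple way to realize the same idea under strong assumptions: textblocks are axis-aligned boxes, textblocks in the same column (or row) overlap when projected onto the x (or y) axis, no cell spans several rows or columns, and ruled lines are given as plain coordinates. All names (`Box`, `merge_intervals`, `boundaries`, `recognize`) are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box; (x0, y0) is top-left, (x1, y1) bottom-right."""
    x0: float
    y0: float
    x1: float
    y1: float


def merge_intervals(intervals):
    """Merge overlapping 1-D intervals into sorted, disjoint bands."""
    bands = []
    for lo, hi in sorted(intervals):
        if bands and lo <= bands[-1][1]:
            bands[-1] = (bands[-1][0], max(bands[-1][1], hi))
        else:
            bands.append((lo, hi))
    return bands


def boundaries(bands, ruled):
    """Return len(bands) + 1 cut positions separating the bands.

    Between two adjacent bands, a ruled line lying in the gap is used if
    one exists; otherwise the midpoint of the gap is used (assumption).
    """
    cuts = [bands[0][0]]
    for (_, hi), (lo, _) in zip(bands, bands[1:]):
        in_gap = [r for r in ruled if hi <= r <= lo]
        cuts.append(in_gap[0] if in_gap else (hi + lo) / 2)
    cuts.append(bands[-1][1])
    return cuts


def band_index(bands, lo, hi):
    """Index of the band that contains the interval [lo, hi]."""
    for i, (b_lo, b_hi) in enumerate(bands):
        if b_lo <= lo and hi <= b_hi:
            return i
    raise ValueError("interval not covered by any band")


def recognize(blocks, v_lines=(), h_lines=()):
    """Expand each textblock to its cell box and number its four edges.

    v_lines holds x-coordinates of vertical ruled lines, h_lines holds
    y-coordinates of horizontal ones. Returns (expanded_box, edges) pairs
    in the order of `blocks`; edges maps each side to a row or column
    boundary number.
    """
    col_bands = merge_intervals([(b.x0, b.x1) for b in blocks])
    row_bands = merge_intervals([(b.y0, b.y1) for b in blocks])
    x_cuts = boundaries(col_bands, v_lines)
    y_cuts = boundaries(row_bands, h_lines)

    cells = []
    for b in blocks:
        c = band_index(col_bands, b.x0, b.x1)
        r = band_index(row_bands, b.y0, b.y1)
        # Procedure 1: expand the bounding box to the cell boundaries.
        expanded = Box(x_cuts[c], y_cuts[r], x_cuts[c + 1], y_cuts[r + 1])
        # Procedure 2: assign row-column numbers to each edge.
        edges = {"left": c, "right": c + 1, "top": r, "bottom": r + 1}
        cells.append((expanded, edges))
    return cells
```

For a 2 x 2 arrangement of textblocks with a single vertical ruled line between the columns, the column boundary snaps to the ruled line while the row boundary falls back to the midpoint of the gap, which mirrors the partially ruled case. Handling spanning cells would require numbering the left and right (or top and bottom) edges independently, which this sketch does not attempt.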