Table processing is a subdomain of document analysis. It commonly focuses on understanding the well-organized information presented in a table and then making entries for the desired items. Earlier research assumes that the data is presented in white-background/black-text (WB/BT) binary form. This assumption, however, does not always hold for color tables. Thus, a preprocessing stage is required to transform a color table into the binary format before existing techniques can be applied to it. In this paper we propose a method for performing this conversion. Our approach is based on locating the background components (i.e., the image background and the table cells) together with their colors. Once these background regions are extracted, we convert the pixels belonging to them to white and all other pixels to black. Since our processing scheme needs no prior knowledge of the color style of the input tables, it can transform a wide variety of color tables into the WB/BT binary form, even when they are scanned with severe skew.
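The final conversion step (background regions to white, everything else to black) can be illustrated with a minimal sketch. The abstract does not say how the background components are located, so the sketch substitutes a simple stand-in: any 4-connected region of identical color whose area reaches a threshold is treated as background. The function name `binarize_color_table`, the `min_area` parameter, and the exact-color-match rule are all assumptions made for illustration, not the method proposed in the paper.

```python
from collections import deque

WHITE, BLACK = 255, 0


def binarize_color_table(image, min_area):
    """Map large uniform-color regions to white and all other pixels to black.

    image:    list of rows, each a list of (r, g, b) tuples.
    min_area: hypothetical threshold; a connected region with at least this
              many pixels is treated as background (image background or cell).
    """
    height, width = len(image), len(image[0])
    visited = [[False] * width for _ in range(height)]
    output = [[BLACK] * width for _ in range(height)]

    for y in range(height):
        for x in range(width):
            if visited[y][x]:
                continue
            # Flood-fill the 4-connected region that shares this exact color.
            color = image[y][x]
            region = [(y, x)]
            visited[y][x] = True
            queue = deque(region)
            while queue:
                cy, cx = queue.popleft()
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and not visited[ny][nx] and image[ny][nx] == color):
                        visited[ny][nx] = True
                        queue.append((ny, nx))
                        region.append((ny, nx))
            # Assumption: only large regions are background; small ones are text or lines.
            if len(region) >= min_area:
                for ry, rx in region:
                    output[ry][rx] = WHITE
    return output
```

Because no color is hard-coded, the sketch needs no prior knowledge of the table's color style, which mirrors the property claimed in the abstract. Real scans contain noise and gradients, so a practical version would need a color tolerance rather than exact equality, and this sketch does not address skew at all.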