Document Recognition System with Layout Structure Generator

1. Abstract

A document input system that uses character recognition techniques converts printed matter, such as books and magazines, into code-format information. To improve the performance of such a system, an appropriate document structure analysis technique is indispensable. When data from general printed documents is stored in a database, the document structure must also be represented, so a method for generating the document layout structure is especially important (6). For this purpose, the authors have developed a document image structure analysis method that generates a layout structure and detects document elements such as characters, pictures, and figures. The method was developed on a personal computer, and its usability is described in this paper.

[1]  Yoshitake Tsuji, et al.  Document Image Analysis For Reading Books, 1987, Other Conferences.

[2]  Friedrich M. Wahl, et al.  Document Analysis System, 1982, IBM J. Res. Dev.

[3]  Yoshitake Tsuji.  Document image analysis for generating syntactic structure description, 1988, [1988 Proceedings] 9th International Conference on Pattern Recognition.

[4]  Y. Tsuji, et al.  Character image segmentation, based upon minimum variance criterion, 1985.

[5]  T. Temma, et al.  Data flow processor chip for image processing, 1985, IEEE Transactions on Electron Devices.