User-defined template for identifying document type and extracting information from documents

An automatic document entry system is described that identifies the type of document and extracts textual information, such as titles or authors, from semi-formatted document images. The system registers documents, offers easy retrieval of documents used in a daily workflow, and analyzes the layout structure of documents by using document-specific models; it assumes that each type of document is known in advance. In this paper, we focus on a method for identifying the type of document.
