Paper information - Automated entry system for printed documents

Abstract: This paper proposes a system for automatically reading Japanese or English documents with complex layout structures that include graphics. First, document image segmentation and character segmentation are carried out using three basic features and knowledge of document layout rules. Next, multi-font character recognition is performed by feature vector matching. Recognition experiments with a prototype system on a variety of complex printed documents show that the proposed system can read different types of printed documents at an accuracy rate of 94.8–97.2%.
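
The abstract does not say which features or distance measure the recognition stage uses. As a generic illustration of feature vector matching only, the sketch below assigns an input vector to the class of its nearest stored template under Euclidean distance. The function name `nearest_template`, the four-dimensional vectors, and the use of several templates per class to stand in for multiple fonts are all assumptions made for this example, not details from the paper.

```python
import math


def nearest_template(feature, templates):
    """Return (label, distance) of the stored template closest to `feature`.

    `templates` maps a class label to a list of reference vectors.
    Euclidean distance is an assumption; the paper's measure is not stated.
    """
    best_label, best_dist = None, math.inf
    for label, vectors in templates.items():
        for vec in vectors:
            dist = math.dist(feature, vec)
            if dist < best_dist:
                best_label, best_dist = label, dist
    return best_label, best_dist


# Hypothetical reference vectors: two templates per class, standing in
# for the same character rendered in two different fonts.
templates = {
    "A": [(0.9, 0.1, 0.8, 0.2), (0.8, 0.2, 0.9, 0.1)],
    "B": [(0.1, 0.9, 0.2, 0.8), (0.2, 0.8, 0.1, 0.9)],
}

label, dist = nearest_template((0.85, 0.15, 0.8, 0.2), templates)
print(label)  # prints "A"
```

Keeping several templates per class is one simple way to tolerate font variation; a real system would more likely use higher-dimensional features and a tuned distance measure.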

Norihiro Hagita | Teruo Akiyama
