Paper information - Conversion of scanned documents to the open document architecture

The paper presents a system for converting scanned documents into the Open Document Architecture (ODA). Unlike previous work in this field, the authors combine several sources of evidence to make the system more robust to document defects and to noise introduced during scanning. They also use optical character recognition together with other forms of image analysis to detect document structure, which improves both feature extraction and overall performance. They demonstrate the system's performance on a specific class of input document.

John P. Oakley | Costas S. Xydeas | Gary S. D. Farrow
