ANASTASIL: A Hybrid Knowledge-Based System for Document Layout Analysis

This paper describes a knowledge-based system that identifies the different regions of a document image. It uses a hybrid, modular knowledge representation whose essential part is a so-called geometric tree. This tree is used to perform a best-first search in combination with a "hypothesize & test" strategy. The system produces an internal, editable description of the entire document and its constituents. It has been implemented in Common Lisp on a SUN 3/60 workstation for the analysis of single-sided business letters, and it has been run on a large population of different business letters. The results obtained have been very encouraging and confirm the soundness of the approach.
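
To make the search strategy concrete, the following is a minimal sketch of a best-first, hypothesize-and-test traversal of a tree of layout regions. It is written in Python for brevity rather than in the system's Common Lisp, and it is not the ANASTASIL implementation: the node structure `GeoNode`, the scoring function `overlap_fraction`, the acceptance threshold, and the example letter layout are all illustrative assumptions.

```python
import heapq
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# (x0, y0, x1, y1) in normalized page coordinates; an assumed convention.
Box = Tuple[float, float, float, float]


@dataclass
class GeoNode:
    """Hypothetical geometric-tree node: a labeled region with sub-regions."""
    label: str
    region: Box
    children: List["GeoNode"] = field(default_factory=list)


def overlap_fraction(block: Box, region: Box) -> float:
    """Fraction of the block's area that lies inside the region."""
    w = min(block[2], region[2]) - max(block[0], region[0])
    h = min(block[3], region[3]) - max(block[1], region[1])
    area = (block[2] - block[0]) * (block[3] - block[1])
    if w <= 0 or h <= 0 or area <= 0:
        return 0.0
    return (w * h) / area


def classify(block: Box, root: GeoNode,
             threshold: float = 0.6) -> Optional[Tuple[str, float]]:
    """Best-first search: always test the highest-scoring hypothesis next."""
    counter = 0  # tie-breaker so heap entries never compare nodes
    frontier = [(-overlap_fraction(block, root.region), counter, 0, root)]
    best = None  # deepest accepted hypothesis: (depth, label, score)
    while frontier:
        neg_score, _, depth, node = heapq.heappop(frontier)
        score = -neg_score
        if score < threshold:
            continue  # test failed: discard the hypothesis
        if not node.children:
            return node.label, score  # most specific hypothesis confirmed
        if best is None or depth > best[0]:
            best = (depth, node.label, score)
        for child in node.children:  # refine: hypothesize each sub-region
            counter += 1
            child_score = overlap_fraction(block, child.region)
            heapq.heappush(frontier, (-child_score, counter, depth + 1, child))
    return None if best is None else (best[1], best[2])


# Assumed layout of a single-sided business letter (illustrative only).
letter = GeoNode("letter", (0.0, 0.0, 1.0, 1.0), [
    GeoNode("header", (0.0, 0.0, 1.0, 0.35), [
        GeoNode("recipient", (0.0, 0.15, 0.5, 0.35)),
        GeoNode("date", (0.5, 0.15, 1.0, 0.35)),
    ]),
    GeoNode("body", (0.0, 0.35, 1.0, 0.85)),
    GeoNode("footer", (0.0, 0.85, 1.0, 1.0)),
])

result = classify((0.1, 0.2, 0.4, 0.3), letter)
```

In this sketch, a block that no leaf region accepts falls back to the deepest inner node that passed the test, so a block straddling the assumed recipient and date regions is reported as part of the header rather than rejected outright.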