Layout analysis of complex documents

Many methods for understanding document images have been investigated in order to extract and classify meaningful information from paper-based documents. These methods have focused on documents of a single type or class, whose layout structure can be represented by a single specification. However, documents in wide use are not always simple: many are composed of several simple documents of different types or classes, and such documents cannot be represented by a simple document model. In this paper, we address the layout analysis of these complex documents and, in particular, discuss the knowledge representation of their layout structure. Representation means can be divided into structure description and operator-specification; ours is based on operator-specification.