A functional classification approach to layout analysis of document images

Layout analysis comprises both the segmentation and the labeling of document images for automatic document input systems. The authors propose a layout analysis method based on a pattern classification scheme. The feature space is defined in terms of low-level image processing features such as connected components and projection profiles. The classifier assigns each connected component a logical label according to its features. Publication-specific information is kept in a reference vector dictionary. In an experiment on technical journal title pages, the connected-component-level recognition rates were 95% for learning samples and 81% for unknown samples.
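As a rough illustration of the classification idea, the sketch below computes projection profiles for a small binary image and labels a connected component by its nearest reference vector. The abstract does not specify the feature set, the distance measure, or the dictionary contents, so the three features (normalized height, width, vertical position), the Euclidean distance, the label names, and all numeric values are assumptions made only for this example.

```python
from math import dist


def projection_profiles(image):
    """Return (horizontal, vertical) projection profiles of a binary image.

    `image` is a list of rows, each a list of 0/1 pixels. The horizontal
    profile counts foreground pixels per row; the vertical one per column.
    """
    horizontal = [sum(row) for row in image]
    vertical = [sum(column) for column in zip(*image)]
    return horizontal, vertical


def classify(features, reference_vectors):
    """Return the label of the reference vector closest to `features`.

    Nearest-reference-vector matching with Euclidean distance is an
    assumption here; the abstract only says that labels are assigned
    according to features.
    """
    candidates = (
        (label, dist(features, reference))
        for label, references in reference_vectors.items()
        for reference in references
    )
    best_label, _ = min(candidates, key=lambda pair: pair[1])
    return best_label


# Hypothetical dictionary for one publication. Each vector holds
# (normalized height, normalized width, normalized vertical position).
REFERENCE_VECTORS = {
    "title": [(0.08, 0.60, 0.10)],
    "author": [(0.03, 0.40, 0.20)],
    "body": [(0.02, 0.45, 0.60)],
}

# Hypothetical feature vector for one connected component.
component = (0.07, 0.55, 0.12)
label = classify(component, REFERENCE_VECTORS)

# Tiny 2x3 binary image used to show the profile computation.
profiles = projection_profiles([[0, 1, 1], [0, 0, 1]])
```

Keeping the publication-specific vectors in a separate dictionary, as in this sketch, means that supporting another journal would only require a different dictionary, not a different classifier.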
