Page segmentation and classification using fast feature extraction and connectivity analysis

Page segmentation and classification are important parts of the document analysis process: the aim is to extract the different parts of a page and assign each a class. This paper proposes an approach that combines these two phases. The combined process uses fast feature extraction with rule-based classification, together with label propagation based on connectivity analysis, and yields areas classified into three categories: background, text, and picture.
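To make the combined pipeline concrete, the following is a minimal sketch, not the paper's actual method. The abstract does not specify the features, the rules, or the propagation scheme, so everything here is an assumption chosen for illustration: the block size, the two features (dark-pixel fraction and horizontal transition rate), the thresholds, the `UNKNOWN` intermediate label, and the function names `block_features`, `classify_block`, and `segment`.

```python
from collections import Counter, deque

BACKGROUND, TEXT, PICTURE, UNKNOWN = "background", "text", "picture", "unknown"


def block_features(page, r0, c0, size, dark_threshold=128):
    """Return (dark_fraction, transition_rate) for one square block.

    Both features are illustrative assumptions, not taken from the paper.
    """
    dark = transitions = total = 0
    for r in range(r0, min(r0 + size, len(page))):
        # Binarize the part of this row that falls inside the block.
        bits = [1 if v < dark_threshold else 0 for v in page[r][c0:c0 + size]]
        dark += sum(bits)
        total += len(bits)
        # Count dark/light changes along the row; text tends to alternate often.
        transitions += sum(1 for a, b in zip(bits, bits[1:]) if a != b)
    return dark / total, transitions / total


def classify_block(dark_fraction, transition_rate):
    """Hypothetical rule set; the thresholds are arbitrary placeholders."""
    if dark_fraction < 0.05:
        return BACKGROUND
    if dark_fraction > 0.6:
        return PICTURE
    if transition_rate > 0.2:
        return TEXT
    return UNKNOWN  # ambiguous block, resolved by propagation below


def segment(page, size=4):
    """Classify blocks by rule, then propagate labels within connected areas."""
    rows = (len(page) + size - 1) // size
    cols = (len(page[0]) + size - 1) // size
    labels = [
        [classify_block(*block_features(page, i * size, j * size, size))
         for j in range(cols)]
        for i in range(rows)
    ]

    seen = [[False] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            if seen[i][j] or labels[i][j] == BACKGROUND:
                continue
            # Collect one 4-connected component of non-background blocks.
            component, queue = [], deque([(i, j)])
            seen[i][j] = True
            while queue:
                r, c = queue.popleft()
                component.append((r, c))
                for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                    if (0 <= nr < rows and 0 <= nc < cols
                            and not seen[nr][nc]
                            and labels[nr][nc] != BACKGROUND):
                        seen[nr][nc] = True
                        queue.append((nr, nc))
            # Ambiguous blocks take the majority label of their component.
            votes = Counter(labels[r][c] for r, c in component
                            if labels[r][c] != UNKNOWN)
            # Fallback to PICTURE when nothing is confident: an arbitrary choice.
            winner = votes.most_common(1)[0][0] if votes else PICTURE
            for r, c in component:
                if labels[r][c] == UNKNOWN:
                    labels[r][c] = winner
    return labels
```

In this sketch only the ambiguous blocks are relabelled, so a text block that touches a picture block keeps its own rule-based label. Whether the paper propagates labels this way is not stated in the abstract.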
