Document region classification using low-resolution images: a human visual perception approach

This paper describes the design of a document region classifier. Each region of a document is classified as either a large text region (LTR) or a non-LTR. The foundations of the classifier are derived from two theories of human visual perception: texture discrimination based on textons, and perceptual grouping. Based on these theories, the classification task is stated as a texture discrimination problem and is implemented as a preattentive process. Once these foundations are defined, engineering techniques are developed to extract features for deciding the class of information contained in each region. The feature derived from the human visual perception theories is a measurement of the periodicity of the blobs in the text regions. This feature is used to design a statistical classifier, based on the minimum probability of error criterion, that separates LTR from non-LTR. The method is tested on free-format, low-resolution document images and achieves 93% correct recognition.
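As an illustration of the minimum probability of error criterion, the sketch below assigns a region to the class with the larger posterior probability given a single scalar periodicity feature. It is not the paper's implementation: the Gaussian class-conditional model, the equal priors, the function names, and the training values are all assumptions made for the example.

```python
import math
from statistics import mean, pvariance


def fit_gaussian(samples):
    """Estimate (mean, variance) of a 1-D feature from training samples."""
    return mean(samples), pvariance(samples)


def log_gaussian(x, mu, var):
    """Log of the Gaussian density N(x; mu, var)."""
    return -0.5 * math.log(2 * math.pi * var) - (x - mu) ** 2 / (2 * var)


def classify(x, params, priors):
    """Return the class with the largest posterior.

    Choosing the maximum-posterior class is the decision rule that
    minimizes the probability of error for known densities and priors.
    """
    scores = {
        label: math.log(priors[label]) + log_gaussian(x, *params[label])
        for label in params
    }
    return max(scores, key=scores.get)


# Hypothetical periodicity measurements, invented for illustration only.
ltr_train = [0.82, 0.90, 0.77, 0.85, 0.88]
non_ltr_train = [0.20, 0.35, 0.15, 0.30, 0.25]

params = {
    "LTR": fit_gaussian(ltr_train),
    "non-LTR": fit_gaussian(non_ltr_train),
}
priors = {"LTR": 0.5, "non-LTR": 0.5}  # assumed equal priors

print(classify(0.80, params, priors))
print(classify(0.20, params, priors))
```

With real data, the class-conditional densities and priors would be estimated from labeled regions, and the Gaussian assumption would need to be checked against the observed distribution of the periodicity feature.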
