Multiscale Document Segmentation

In this paper, we propose a new approach to document segmentation that exploits both local texture characteristics and image structure to segment scanned documents into regions such as text, background, headings, and images. Our method is built on a multiscale Bayesian framework, chosen because it allows accurate modeling of both the image characteristics and the contextual structure of each region. The model parameters are estimated from a database of training images, produced by scanning typical documents and segmenting them by hand into the desired components. The training procedure is based on the expectation-maximization (EM) algorithm and yields approximate maximum likelihood (ML) estimates of the model parameters for region textures and contextual structure at various resolutions. Once training is complete, scanned documents can be segmented using a computationally efficient fine-to-coarse-to-fine procedure.
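To make the fine-to-coarse-to-fine idea concrete, the following is a minimal sketch and not the paper's algorithm. It assumes a 1-D signal on a binary tree rather than a 2-D image pyramid, per-leaf class log-likelihoods that are already given, and a single hypothetical parameter `theta` for the probability that a child keeps its parent's class. The function names (`segment`, `log_transition`, `logsumexp`) are illustrative only.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) over a list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def log_transition(child, parent, num_classes, theta):
    """Log-probability of a child class given its parent's class (assumed model)."""
    if child == parent:
        return math.log(theta)
    return math.log((1.0 - theta) / (num_classes - 1))


def segment(leaf_loglik, theta=0.9):
    """Label the leaves of a binary tree; len(leaf_loglik) must be a power of two.

    leaf_loglik[i][k] is the log-likelihood of the data at leaf i under class k.
    """
    num_classes = len(leaf_loglik[0])
    classes = range(num_classes)

    # Fine-to-coarse pass: each parent accumulates, for every candidate class,
    # the log-likelihood of all data below it, marginalizing over child classes.
    levels = [leaf_loglik]
    while len(levels[-1]) > 1:
        fine = levels[-1]
        coarse = []
        for i in range(0, len(fine), 2):
            node = []
            for k in classes:
                total = 0.0
                for child in fine[i:i + 2]:
                    total += logsumexp(
                        [child[m] + log_transition(m, k, num_classes, theta)
                         for m in classes]
                    )
                node.append(total)
            coarse.append(node)
        levels.append(coarse)

    # Coarse-to-fine pass: pick the root class (uniform prior assumed), then
    # label each finer level given the label already chosen for its parent.
    root = levels[-1][0]
    labels = [max(classes, key=lambda k: root[k])]
    for fine in reversed(levels[:-1]):
        new_labels = []
        for i, node in enumerate(fine):
            parent = labels[i // 2]
            best = max(
                classes,
                key=lambda m: node[m] + log_transition(m, parent, num_classes, theta),
            )
            new_labels.append(best)
        labels = new_labels
    return labels


if __name__ == "__main__":
    # Eight leaves, two classes; leaf 2 weakly prefers class 1 on its own.
    leaves = [[0.0, -3.0], [0.0, -3.0], [-0.5, 0.0], [0.0, -3.0],
              [-3.0, 0.0], [-3.0, 0.0], [-3.0, 0.0], [-3.0, 0.0]]
    print(segment(leaves))
```

In this toy input, a per-leaf maximum-likelihood decision would assign leaf 2 to class 1, whereas the two-pass procedure lets the coarser context overrule that weak local evidence. Each node is visited once in each direction, so the cost is linear in the number of leaves.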
