MULTISCALE PAGE SEGMENTATION USING WAVELET PACKET ANALYSIS

In this paper, a novel method for document page segmentation using Wavelet Packet analysis is proposed. To discriminate between text and non-text regions, the image is represented by means of a wavelet packet analysis tree. Next, a feature image is introduced to summarize the information carried by a set of nodes selected from this quadtree. The most discriminant nodes are derived using an optimality criterion and a genetic algorithm. Finally, the selected feature image is segmented by means of Fuzzy C-Means clustering. The approach provides good segmentation results and is shown to be invariant to page skew and font variations. [DOI: 10.1685/CSC06090]
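The final clustering step can be illustrated with a minimal sketch of standard Fuzzy C-Means in NumPy. This is not the authors' implementation: the abstract does not give the number of clusters, the fuzzifier, or the exact features, so `n_clusters`, `m`, and the synthetic `features` array (a stand-in for per-pixel feature-image values) are assumptions made only for illustration.

```python
import numpy as np

def fuzzy_c_means(X, n_clusters=2, m=2.0, n_iter=100, tol=1e-5, seed=0):
    """Standard Fuzzy C-Means.

    X has shape (n_samples, n_features). Returns (centers, U), where
    U[i, k] is the membership of sample i in cluster k and each row sums to 1.
    """
    rng = np.random.default_rng(seed)
    # Random initial memberships, normalised so each row sums to 1.
    U = rng.random((X.shape[0], n_clusters))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        # Cluster centres: means weighted by memberships raised to the fuzzifier m.
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        # Euclidean distance from every sample to every centre.
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        dist = np.maximum(dist, 1e-12)  # avoid division by zero
        # Membership update: u_ik proportional to d_ik^(-2 / (m - 1)).
        inv = dist ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U

# Hypothetical stand-in for feature-image pixels: two well-separated groups
# (for example "text-like" and "non-text-like" responses). Assumed data only.
data_rng = np.random.default_rng(1)
features = np.vstack([
    data_rng.normal(0.0, 0.1, size=(50, 2)),
    data_rng.normal(5.0, 0.1, size=(50, 2)),
])
centers, U = fuzzy_c_means(features, n_clusters=2)
labels = U.argmax(axis=1)  # hard labels obtained from the fuzzy memberships
```

For an actual page, each row of `features` would hold the feature-image values at one pixel, and `labels` would be reshaped back to the image dimensions to obtain the segmentation map.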
