PLA using RLSA and a neural network

Abstract: This paper describes a new method for document page layout analysis. The proposed approach is based on the run-length smoothing algorithm (RLSA) and a neural network block classifier (NNBC). The RLSA is applied locally and globally for block segmentation, using optimal pre-estimated smoothing values. The NNBC is used in the classification steps of the method to assign the blocks of the document to basic classes or subclasses. The NNBC consists of a principal component analyzer (PCA) and a self-organized feature map (SOFM). The input feature vector is a set of features corresponding to the contents and the relationships of 3×3 masks. This set is chosen by a statistical selection procedure and provides textural information. In the final step, after the application of a grouping procedure, the document blocks are classified as text frames and isolated text lines, graphics, and halftones, or into secondary subclasses corresponding to special cases of the basic classes. The proposed method can identify blocks that cannot be separated with horizontal and vertical cuts, and it classifies blocks accurately even on documents of poor scanning quality. The performance of the method has been extensively tested on a variety of documents. Several examples illustrate the strength and effectiveness of the methodology.
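To make the segmentation step more concrete, the following is a minimal Python sketch of run-length smoothing on a binary image (1 = foreground, 0 = background). It is an illustration under stated assumptions, not the implementation used in the paper: the function names `rlsa_row` and `rlsa` and the threshold parameters are hypothetical, only background runs enclosed between two foreground pixels are filled, and the horizontal and vertical results are combined with a logical AND. The paper's local and global application of the RLSA and its pre-estimated smoothing values are not reproduced here.

```python
def rlsa_row(row, threshold):
    """Smooth one binary sequence (assumed variant of RLSA).

    Background runs (0s) of length <= threshold that lie between two
    foreground pixels (1s) are filled with 1s. Runs touching the
    sequence boundary are left unchanged in this sketch.
    """
    out = list(row)
    last_fg = None  # index of the most recent foreground pixel seen
    for i, pixel in enumerate(row):
        if pixel == 1:
            gap = i - last_fg - 1 if last_fg is not None else 0
            if 0 < gap <= threshold:
                for j in range(last_fg + 1, i):
                    out[j] = 1
            last_fg = i
    return out


def rlsa(image, h_threshold, v_threshold):
    """Smooth rows and columns separately, then AND the two results.

    h_threshold and v_threshold are hypothetical parameters standing in
    for the smoothing values, which the paper estimates beforehand.
    """
    horizontal = [rlsa_row(r, h_threshold) for r in image]
    # Transpose, smooth each column as a sequence, and transpose back.
    columns = [list(c) for c in zip(*image)]
    smoothed_columns = [rlsa_row(c, v_threshold) for c in columns]
    vertical = [list(r) for r in zip(*smoothed_columns)]
    return [[h & v for h, v in zip(h_row, v_row)]
            for h_row, v_row in zip(horizontal, vertical)]
```

With a larger threshold, neighbouring characters and words merge into solid blocks, which is what makes the smoothed image usable for block segmentation; the connected regions of the result would then be passed on to a classifier.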
