Segmentation of color documents by line oriented clustering using spatial information

We introduce a new method for the global segmentation of color documents whose structure consists of text frames and pictures. The method is based on an extensive analysis of the expected shape of clusters in RGB color space. It yields a better segmentation than k-means-based clustering and provides a suitable basis for indexing and layout analysis. Results are promising.
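
The abstract does not spell out the algorithm, so the following is only an illustrative sketch of what clustering around lines rather than points in RGB space might look like. It assumes that each cluster is modeled as a straight line (a point plus a unit direction) fitted by PCA, and that pixels are assigned to the line with the smallest perpendicular distance. The names `fit_line`, `line_distance`, and `k_lines` are hypothetical, and the spatial information mentioned in the title is not used here.

```python
import numpy as np

def fit_line(points):
    """Fit a 3-D line to points: returns (mean point, unit direction)."""
    mean = points.mean(axis=0)
    # The first right-singular vector is the direction of largest variance.
    _, _, vt = np.linalg.svd(points - mean, full_matrices=False)
    return mean, vt[0]

def line_distance(points, mean, direction):
    """Perpendicular distance from each point to the line."""
    diff = points - mean
    along = diff @ direction                    # component along the line
    perp = diff - np.outer(along, direction)    # component orthogonal to it
    return np.linalg.norm(perp, axis=1)

def k_lines(pixels, k, n_iter=20, seed=0):
    """Assumed 'k-lines' loop: like k-means, but each cluster is a line.

    pixels: (N, 3) float array of RGB values. Returns (labels, lines).
    """
    rng = np.random.default_rng(seed)
    labels = rng.integers(0, k, size=len(pixels))   # random initial assignment
    for _ in range(n_iter):
        lines = []
        for j in range(k):
            members = pixels[labels == j]
            if len(members) < 2:
                # Re-seed a (nearly) empty cluster from two random pixels.
                members = pixels[rng.choice(len(pixels), size=2, replace=False)]
            lines.append(fit_line(members))
        # Distance of every pixel to every line, shape (N, k).
        dists = np.stack([line_distance(pixels, m, d) for m, d in lines], axis=1)
        new_labels = dists.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break                                    # assignments are stable
        labels = new_labels
    return labels, lines
```

For an image held as an `(H, W, 3)` array `img`, a call such as `k_lines(img.reshape(-1, 3).astype(float), k=3)` would give one label per pixel. Compared with k-means, elongated clusters (for example, one ink color under varying brightness) are less likely to be split along their length, since distance along the line is not penalized.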
