Text/graphics separation using agent-based pyramid operations

This paper describes a document image analysis system in which multiple agents operate on a pyramid structure to separate text from graphics in an image. Text strings appear as different groupings of connected components at different image resolutions. The pyramid structure, a multi-resolution image representation, therefore provides a natural means of identifying and grouping character strings in the document at different levels of resolution. The pyramid structure is also amenable to parallel processing: multiple agents in the system can individually and concurrently look for groups of connected components at the appropriate levels. Unlike other existing works, the agent-based pyramid operations do not require expensive feature analysis among different connected components to detect text strings.
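The idea that characters merge into string-level groupings at coarser resolutions can be illustrated with a minimal sketch. It is not the system described in the paper: the 2x2 OR reduction rule, the 8-connectivity, the toy image, and the names `reduce_level`, `build_pyramid`, and `count_components` are all assumptions made for illustration, and a thread pool stands in for the agents, with one worker per pyramid level.

```python
from collections import deque
from concurrent.futures import ThreadPoolExecutor


def reduce_level(img):
    """Halve the resolution of a binary image.

    Assumed rule: a parent pixel is 1 if any pixel in its 2x2 block is 1.
    """
    h, w = len(img), len(img[0])
    out = []
    for r in range(0, h, 2):
        row = []
        for c in range(0, w, 2):
            block = [img[rr][cc]
                     for rr in range(r, min(r + 2, h))
                     for cc in range(c, min(c + 2, w))]
            row.append(1 if any(block) else 0)
        out.append(row)
    return out


def build_pyramid(img, levels):
    """Return [level 0 (full resolution), level 1, ...]."""
    pyramid = [img]
    for _ in range(levels - 1):
        pyramid.append(reduce_level(pyramid[-1]))
    return pyramid


def count_components(img):
    """Count 8-connected components of 1-pixels with a breadth-first search."""
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    count = 0
    for r in range(h):
        for c in range(w):
            if not img[r][c] or seen[r][c]:
                continue
            count += 1
            seen[r][c] = True
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and img[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
    return count


# Toy page: three 2x2 "characters" separated by one-pixel gaps.
page = [
    [1, 1, 0, 1, 1, 0, 1, 1],
    [1, 1, 0, 1, 1, 0, 1, 1],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
]

pyramid = build_pyramid(page, 3)

# One worker per level stands in for an agent inspecting that level.
with ThreadPoolExecutor(max_workers=len(pyramid)) as pool:
    counts = list(pool.map(count_components, pyramid))

print(counts)  # [3, 1, 1]
```

At full resolution the three characters are separate components. One level up, the one-pixel gaps close and the characters form a single string-like component, which is the kind of grouping an agent at that level could pick up without comparing features of the individual characters.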
