A Graphic Page Image Processing System

The U.S. patent database stores patent images with figures and their related descriptions on separate pages, which makes it difficult for users to refer to a figure while reading its description. To prepare these patent images for a user-friendly interface, this paper presents a system that segments a multiple-figure page image into individual figures and extracts text information, such as captions and labels, from each figure. Once captions and labels are obtained, figures can be linked to their related descriptions, so that users can easily move from a description to the corresponding figure and vice versa.
