Picture detection in document page images

We present a method for detecting pictures in document page images, which may be scanned, captured by a camera, or rendered from electronic file formats. The method uses OCR to separate out the text and applies the Normalized Cuts algorithm to cluster the non-text pixels into picture regions. A refinement step uses the captions found in the OCR text to deduce how many pictures each picture region contains, thereby correcting for under- and over-segmentation. We evaluate performance with a scheme that accounts for both detection quality and fragmentation quality, and we benchmark the method against the ABBYY application on page images from conference papers.
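
The caption-based refinement idea can be illustrated with a minimal sketch. It is not the paper's implementation: the caption pattern, the function names `count_captions` and `segmentation_hint`, and the simple count comparison are all assumptions made for illustration. The sketch counts distinct figure captions in OCR text and compares that count with the number of picture regions produced by clustering.

```python
import re

# Assumed caption pattern (hypothetical): a line starting with
# "Figure 3:", "Fig. 3." or similar. Real captions vary widely.
CAPTION_RE = re.compile(
    r"^\s*(?:figure|fig\.?)\s*(\d+)\s*[:.]",
    re.IGNORECASE | re.MULTILINE,
)


def count_captions(ocr_text: str) -> int:
    """Count distinct figure numbers that open a line of OCR text."""
    return len({int(m.group(1)) for m in CAPTION_RE.finditer(ocr_text)})


def segmentation_hint(num_regions: int, ocr_text: str) -> str:
    """Compare the number of clustered picture regions with the caption count.

    More regions than captions suggests over-segmentation (regions to merge);
    fewer suggests under-segmentation (a region to split).
    """
    expected = count_captions(ocr_text)
    if expected == 0:
        return "unknown"  # no captions found, so nothing to compare against
    if num_regions > expected:
        return "over-segmented"
    if num_regions < expected:
        return "under-segmented"
    return "consistent"


page_text = (
    "Figure 1: System overview.\n"
    "Some body text refers to Fig. 2 here.\n"
    "Fig. 2. Results on page images.\n"
)
print(count_captions(page_text))
print(segmentation_hint(3, page_text))
```

Only captions that open a line are counted, so in-text mentions such as "refers to Fig. 2" do not inflate the estimate. A real system would also need the caption positions to decide which regions to merge or split.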
