Direct Processing of Run Length Compressed Document Image for Segmentation and Characterization of a Specified Block

Extracting a block of interest, that is, segmenting a specified block in an image and studying its characteristics, is of general research interest, and can be challenging if the segmentation has to be carried out directly on a compressed image. This is the objective of the present research work. The proposal is to develop a method that segments and extracts a specified block and characterizes it without decompressing the compressed image, for two main reasons: most image archives store images in compressed format, and decompressing an image incurs additional computing time and space. Specifically, this research work deals with run-length compressed document images.
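To make the idea of working on the compressed data concrete, the following is a minimal sketch, not the method of this work. It assumes a hypothetical representation in which each image row is a list of alternating run lengths that starts with a background (white) run, where only that first run may be zero. The function names `crop_row`, `extract_block`, and `foreground_pixels` are illustrative. The sketch cuts a rectangular block out of the runs and computes one simple characteristic, the foreground pixel count, without ever expanding a row into pixels.

```python
def crop_row(runs, c0, c1):
    """Return the runs of one row restricted to columns [c0, c1).

    Assumed format: alternating run lengths, first run is white
    (and may be 0); no other zero-length runs.
    """
    out = []
    pos = 0
    for i, length in enumerate(runs):
        start, end = pos, pos + length
        pos = end
        overlap = min(end, c1) - max(start, c0)
        if overlap <= 0:
            if start >= c1:
                break          # past the window, nothing more to collect
            continue           # run lies before the window (or is empty)
        is_black = (i % 2 == 1)
        if not out and is_black:
            out.append(0)      # keep the "first run is white" convention
        out.append(overlap)
    return out


def extract_block(rows, r0, r1, c0, c1):
    """Crop rows [r0, r1) and columns [c0, c1) directly from the runs."""
    return [crop_row(runs, c0, c1) for runs in rows[r0:r1]]


def foreground_pixels(block):
    """Count black pixels: the sum of the odd-indexed runs in every row."""
    return sum(sum(runs[1::2]) for runs in block)


if __name__ == "__main__":
    # Three rows of a 10-pixel-wide image, in the assumed run format.
    rows = [[2, 3, 4, 1], [10], [0, 10]]
    block = extract_block(rows, 0, 3, 3, 10)
    print(block, foreground_pixels(block))
```

The cost of each crop is proportional to the number of runs in the row rather than to the image width, which is the kind of saving that motivates processing in the run-length domain.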
