Image Coding for Binary Document Based on the Regional Features

Lossless coding is commonly used for binary images, but it yields a comparatively low compression ratio. In this paper, image segmentation is used to classify a document image into line image regions, text image regions, and halftone image regions. Each region type is then encoded with a method suited to its features in order to improve the overall compression ratio: adaptive arithmetic coding for line image regions, symbol dictionary coding for text image regions, and vector quantization for halftone image regions. Experiments show that this method effectively improves the compression ratio of binary document images.
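The region-to-coder assignment described above can be sketched as a simple dispatch table. This is a minimal illustration only, not the paper's implementation: the names `RegionKind`, `Region`, `CODER_FOR_KIND`, and `plan_encoding` are hypothetical, the bounding-box representation of a region is an assumption, and the segmentation step and the coders themselves are not shown.

```python
from dataclasses import dataclass
from enum import Enum


class RegionKind(Enum):
    """The three region classes named in the abstract."""
    LINE = "line"
    TEXT = "text"
    HALFTONE = "halftone"


# Region class -> coding method, as stated in the abstract.
# The values are descriptive labels only; no actual coder is implemented here.
CODER_FOR_KIND = {
    RegionKind.LINE: "adaptive arithmetic coding",
    RegionKind.TEXT: "symbol dictionary coding",
    RegionKind.HALFTONE: "vector quantization",
}


@dataclass(frozen=True)
class Region:
    """Hypothetical segmentation output: a bounding box plus its class."""
    top: int
    left: int
    height: int
    width: int
    kind: RegionKind


def plan_encoding(regions):
    """Pair each segmented region with the coding method for its class."""
    return [(region, CODER_FOR_KIND[region.kind]) for region in regions]


if __name__ == "__main__":
    # Made-up regions for illustration; real ones would come from segmentation.
    page = [
        Region(0, 0, 40, 600, RegionKind.TEXT),
        Region(40, 0, 200, 300, RegionKind.HALFTONE),
        Region(40, 300, 200, 300, RegionKind.LINE),
    ]
    for region, coder in plan_encoding(page):
        print(region.kind.value, "->", coder)
```

Keeping the assignment in one table makes the design choice explicit: the gain claimed in the abstract comes from matching each coder to the statistics of its region type rather than from any single coder.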
