Robustness Assessment of Texture Features for the Segmentation of Ancient Documents

For the segmentation of digitized ancient document images, texture feature analysis has been shown to be a consistent choice for segmenting a page layout under significant and varied degradations. In addition, texture-based approaches have been shown to work effectively without any assumption about the document structure, the document model, or the typographical parameters. Thus, investigating texture as a tool for automatic image segmentation, we propose to search for homogeneous and similar content regions by analyzing texture features within a multiresolution analysis. Preliminary results show the effectiveness of texture features extracted from the autocorrelation function, the Grey Level Co-occurrence Matrix (GLCM), and Gabor filters. To assess the robustness of the proposed texture-based approaches, images are generated under numerous degradation models, and two image enhancement algorithms (non-local means filtering and superpixel techniques) are evaluated with several accuracy metrics. This study shows that texture feature extraction is robust for segmentation in the presence of noise and that a denoising step is unnecessary.
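To make the GLCM-based features mentioned above more concrete, the following is a minimal sketch of how a co-occurrence matrix and a few classical descriptors can be computed for one image patch. It is an illustration only, not the authors' implementation: the function names `glcm` and `glcm_features`, the choice of offset, the number of grey levels, and the three descriptors are all assumptions, and the abstract does not say which GLCM statistics or window sizes are actually used.

```python
import numpy as np


def glcm(patch, dx=1, dy=0, levels=4):
    """Normalized, symmetric grey-level co-occurrence matrix for one offset.

    `patch` is a 2-D array of integer grey levels in [0, levels).
    (dx, dy) is the pixel offset; both are illustrative defaults.
    """
    img = np.asarray(patch).astype(np.intp)
    h, w = img.shape
    # Pair each pixel (r, c) with its neighbour (r + dy, c + dx),
    # keeping only pairs that fall inside the patch.
    src = img[max(0, -dy):h - max(0, dy), max(0, -dx):w - max(0, dx)]
    dst = img[max(0, dy):h - max(0, -dy), max(0, dx):w - max(0, -dx)]
    counts = np.zeros((levels, levels), dtype=np.float64)
    np.add.at(counts, (src.ravel(), dst.ravel()), 1.0)
    counts += counts.T  # count each pair in both directions
    return counts / counts.sum()


def glcm_features(p):
    """Contrast, angular second moment and homogeneity of a normalized GLCM."""
    i, j = np.indices(p.shape)
    diff2 = (i - j) ** 2
    return {
        "contrast": float((p * diff2).sum()),
        "asm": float((p ** 2).sum()),
        "homogeneity": float((p / (1.0 + diff2)).sum()),
    }


if __name__ == "__main__":
    # Hypothetical 4x4 patch quantized to two grey levels.
    checker = np.indices((4, 4)).sum(axis=0) % 2
    print(glcm_features(glcm(checker, levels=2)))
```

In a multiresolution setting, one plausible use is to evaluate such descriptors over sliding windows of several sizes and concatenate the results into one feature vector per pixel before clustering; that pipeline detail is likewise an assumption here.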
