Spatial and Spectral Based Segmentation of Text in Multispectral Images of Ancient Documents

In this paper we propose a character segmentation method for multispectral images of ancient documents. Because the images are of low quality, the method combines multispectral behavior with contextual spatial information. To this end, we use a Markov Random Field model that draws on the spectral information of the images and on stroke properties to capture the spatial dependencies of the characters. Since the stroke properties and the Gaussian parameters of the imaging model are estimated automatically, the proposed segmentation method requires no training phase. We compare the method with state-of-the-art character segmentation methods and demonstrate the effectiveness of combining spectral and spatial features for segmenting characters in multispectral images.
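
To make the idea of combining spectral and spatial evidence concrete, the following is a minimal sketch and not the method of the paper. It assumes a two-class labeling (text and background), one independent Gaussian per spectral band and class, and a plain Potts smoothness prior over 4-neighborhoods in place of the stroke properties, which the abstract does not specify. The Gaussian parameters are estimated once from a rough brightness threshold, and the labeling is refined with iterated conditional modes (ICM). The names `estimate_gaussians`, `data_energy`, `segment`, `beta`, and `sweeps` are hypothetical.

```python
import math


def estimate_gaussians(image, labels):
    """Per-class, per-band mean and variance (diagonal covariance assumed)."""
    bands = len(image[0][0])
    params = {}
    for k in (0, 1):
        pixels = [image[r][c]
                  for r in range(len(image))
                  for c in range(len(image[0]))
                  if labels[r][c] == k]
        if not pixels:
            raise ValueError("initial threshold left a class empty")
        means = [sum(p[b] for p in pixels) / len(pixels) for b in range(bands)]
        variances = [
            # Floor the variance so a perfectly flat class cannot divide by zero.
            max(sum((p[b] - means[b]) ** 2 for p in pixels) / len(pixels), 1e-6)
            for b in range(bands)
        ]
        params[k] = (means, variances)
    return params


def data_energy(pixel, means, variances):
    """Negative log-likelihood (up to a constant) of one multispectral pixel."""
    return sum((x - m) ** 2 / (2.0 * v) + 0.5 * math.log(v)
               for x, m, v in zip(pixel, means, variances))


def segment(image, beta=2.0, sweeps=5):
    """Label each pixel 1 (text) or 0 (background).

    image: list of rows; each pixel is a tuple with one value per band.
    beta:  weight of the Potts prior (assumed; 0 disables spatial context).
    """
    rows, cols = len(image), len(image[0])
    # Rough initialisation: pixels darker than the global mean start as text.
    brightness = [[sum(p) / len(p) for p in row] for row in image]
    threshold = sum(map(sum, brightness)) / (rows * cols)
    labels = [[1 if v < threshold else 0 for v in row] for row in brightness]
    params = estimate_gaussians(image, labels)

    for _ in range(sweeps):
        changed = False
        for r in range(rows):
            for c in range(cols):
                neighbours = [labels[rr][cc]
                              for rr, cc in ((r - 1, c), (r + 1, c),
                                             (r, c - 1), (r, c + 1))
                              if 0 <= rr < rows and 0 <= cc < cols]
                # ICM step: pick the label with the lowest local energy.
                best = min((0, 1), key=lambda k: (
                    data_energy(image[r][c], *params[k])
                    + beta * sum(n != k for n in neighbours)))
                if best != labels[r][c]:
                    labels[r][c] = best
                    changed = True
        if not changed:
            break
    return labels
```

With `beta=0` the labeling depends on the spectral likelihood alone, so an isolated ambiguous background pixel can remain labeled as text. A positive `beta` lets agreeing neighbors overrule it, which is the kind of spatial context the abstract argues for.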
