Restoration and Segmentation of Highly Degraded Characters Using a Shape-Independent Level Set Approach and Multi-level Classifiers

Segmentation of ancient documents is challenging. In the worst cases, text characters become fragmented as a result of strong degradation processes. Recent active contour methods make it possible to handle such difficult cases in a spatially coherent fashion. However, most of those methods rely on restrictive a priori shape information, which limits their application. In this work, we propose to address this issue by combining two complementary approaches. First, multi-level classifiers, which exploit a priori information about the stroke width, locate candidate character pixels. Second, a level set active contour scheme identifies the boundary of each character. Tests have been conducted on a set of ancient, degraded Hebrew character images. Numerical results are promising.
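To make the second step more concrete, the following is a minimal sketch of a region-based level set evolution that uses no shape prior. It is not the scheme proposed in this work: it assumes a simplified two-phase, piecewise-constant model in the spirit of Chan and Vese's "active contours without edges", with the curvature term and reinitialization omitted. The function name `evolve_region_level_set`, its parameters, and the synthetic image are hypothetical, and the square seed merely stands in for the candidate character pixels that the multi-level classifiers would supply.

```python
import numpy as np


def evolve_region_level_set(image, init_mask, iterations=50, dt=0.5):
    """Evolve a level set function phi with a two-phase region force.

    Assumption: simplified piecewise-constant model, no curvature
    regularization, no shape prior. Pixels with phi > 0 are "inside".
    """
    # Initialize phi to +1 inside the seed region and -1 outside it.
    phi = np.where(init_mask, 1.0, -1.0)
    for _ in range(iterations):
        inside = phi > 0
        if inside.all() or not inside.any():
            break  # one region is empty, so its mean is undefined
        c_in = image[inside].mean()    # mean intensity inside the contour
        c_out = image[~inside].mean()  # mean intensity outside the contour
        # Positive where a pixel resembles the inside mean more closely.
        force = (image - c_out) ** 2 - (image - c_in) ** 2
        # Normalize so that dt bounds the per-step change in phi.
        phi = phi + dt * force / (np.abs(force).max() + 1e-8)
    return phi > 0


# Hypothetical test image: a dark stroke on light "paper".
image = np.full((20, 20), 0.9)
image[8:12, 3:17] = 0.1

# Hypothetical seed: a square that only partly overlaps the stroke.
seed = np.zeros_like(image, dtype=bool)
seed[5:15, 5:15] = True

segmented = evolve_region_level_set(image, seed)
```

On this synthetic image the contour contracts from the square seed onto the dark stroke. Because the force depends only on region statistics, no character shape has to be known in advance. A practical scheme would also need a smoothness term, and the sketch does not attempt to reconnect fragmented strokes.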
