A Coupled Mean Shift-Anisotropic Diffusion Approach for Document Image Segmentation and Restoration

Mean shift, a powerful color clustering approach successfully applied to image segmentation, has two properties that make it relevant to document image segmentation: it determines both the number of color clusters and their centers automatically, and it tolerates noisy data well. Hence, mean shift can robustly process document images with degraded backgrounds and improve their legibility. This paper shows, however, that coupling mean shift with anisotropic diffusion in a joint iterative framework gives better results. In particular, the framework generates segmented images with fewer artefacts on edges and in the background than those obtained by applying either method alone. This improvement is explained by the mutual interaction of global and local information, introduced respectively by mean shift and anisotropic diffusion, and by the nature of the latter, which smooths while preserving edges. Experiments on real ancient document images illustrate these ideas and indicate that the proposed framework is an efficient tool for document image segmentation and restoration.
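The interplay of local and global information described above can be illustrated with a minimal sketch. It is not the implementation used in this paper: it assumes a 1-D grayscale scanline rather than a color image, a Perona-Malik style diffusion step with an exponential edge-stopping function, a Gaussian-kernel mean shift step in intensity space, and a simple alternation of the two. The function names and the parameters `kappa`, `dt`, `bandwidth` and `iterations` are hypothetical choices made for illustration.

```python
import math

def diffusion_step(u, kappa=0.1, dt=0.2):
    """One explicit anisotropic diffusion step on a 1-D signal (local information)."""
    n = len(u)
    out = list(u)
    for i in range(n):
        # Differences to the right and left neighbours (no flux at the borders).
        d_right = u[i + 1] - u[i] if i + 1 < n else 0.0
        d_left = u[i - 1] - u[i] if i > 0 else 0.0
        # Edge-stopping function: close to 1 in flat regions, close to 0 across edges.
        g_right = math.exp(-((d_right / kappa) ** 2))
        g_left = math.exp(-((d_left / kappa) ** 2))
        out[i] = u[i] + dt * (g_right * d_right + g_left * d_left)
    return out

def mean_shift_step(u, bandwidth=0.2):
    """One mean shift step in intensity space (global information).

    Each value moves to the Gaussian-weighted mean of all values in the signal.
    """
    out = []
    for x in u:
        weights = [math.exp(-0.5 * ((x - y) / bandwidth) ** 2) for y in u]
        out.append(sum(w * y for w, y in zip(weights, u)) / sum(weights))
    return out

def coupled_filter(u, iterations=20):
    """Alternate the two steps; the alternation order is an assumption."""
    for _ in range(iterations):
        u = diffusion_step(u)
        u = mean_shift_step(u)
    return u

# Hypothetical scanline: noisy paper (near 0.9) followed by noisy ink (near 0.1).
paper = [0.93, 0.88, 0.91, 0.86, 0.92, 0.89, 0.94, 0.87]
ink = [0.12, 0.07, 0.13, 0.08, 0.11, 0.06, 0.14, 0.09]
restored = coupled_filter(paper + ink)
```

In this toy setting, each region is flattened toward a single intensity while the paper/ink boundary stays sharp, because the diffusion step barely acts across the large intensity jump and the mean shift step gives almost no weight to values from the other cluster.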
