Generic filtering and removing artefacts from document images using unsupervised PSO optimisation

Research on document image analysis and optical recognition has accelerated recently, driven by emerging applications that are both challenging and computationally demanding, such as mail and document sorting, automatic document classification, and handwriting and script recognition. In this paper, our contribution focuses on the preprocessing stage of these applications: smoothing and filtering degraded document images using a new adaptive mean shift algorithm based on the integral image. Setting the parameters of this approach is difficult and amounts to solving a complex optimisation problem, which we address with metaheuristic algorithms. Our goal is to demonstrate how the particle swarm optimisation (PSO) method improves both the parameter setting and the resulting quality of the developed preprocessing approach. We test and compare two types of objective function (supervised and unsupervised) and demonstrate the effectiveness of the optimisation in an unsupervised context.
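
To make the role of PSO in parameter setting concrete, the following is a minimal sketch of a standard global-best PSO loop in Python. It is not the implementation used in the paper. The function name `pso_minimise`, the inertia and acceleration coefficients, and the toy quadratic objective are all assumptions chosen for illustration. In the setting described above, the objective would instead score the filtered image (against ground truth in the supervised case, or with a reference-free quality measure in the unsupervised case), and the two search dimensions could stand for hypothetical filter parameters such as a spatial and a range bandwidth.

```python
import random


def pso_minimise(objective, bounds, n_particles=20, n_iters=100,
                 inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Global-best PSO over a box-constrained search space (illustrative)."""
    rng = random.Random(seed)
    dim = len(bounds)

    # Random initial positions inside the bounds, zero initial velocities.
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]

    # Personal bests and the swarm-wide (global) best.
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity update: inertia + cognitive pull + social pull.
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Move the particle and clamp it to the allowed range.
                lo, hi = bounds[d]
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)

            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val

    return gbest, gbest_val


def toy_objective(params):
    # Stand-in for a filter-quality criterion (assumption): a quadratic
    # whose minimum lies at the hypothetical parameter pair (3.0, 0.5).
    return (params[0] - 3.0) ** 2 + (params[1] - 0.5) ** 2


if __name__ == "__main__":
    best, value = pso_minimise(toy_objective, [(0.0, 10.0), (0.0, 1.0)])
    print(best, value)
```

Only the `objective` callable changes between a supervised and an unsupervised setting; the swarm update itself is the same in both cases.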
