A splicing-driven memetic algorithm for reconstructing cross-cut shredded text documents

Highlights
- Develop a splicing-driven memetic algorithm to reconstruct cross-cut shredded text documents.
- Design a comprehensive cost function to evaluate solutions and to guide individual search.
- Design novel reproduction operators to effectively utilize the adjacency information of shreds.
- Propose an elitism-based local search strategy to further enhance efficiency.
- Obtain good reconstruction performance in terms of solution quality and convergence speed.

Abstract
Reconstruction of cross-cut shredded text documents (RCCSTD) plays a crucial role in fields such as forensics and archaeology. Handling and reconstructing the shreds requires, in addition to image processing procedures, a well-designed optimization algorithm. Existing work applies general-purpose methods to both of these tasks, which may be inefficient because such methods ignore the specific structure and characteristics of RCCSTD. In this paper, we develop a splicing-driven memetic algorithm (SD-MA) specifically for this problem. As the name indicates, the algorithm is designed from a splicing-centered perspective: the operators and the fitness evaluation are both developed for the purpose of splicing the shreds. We design novel crossover and mutation operators that use the adjacency information in the shreds to breed high-quality offspring. A shred-based local search strategy is then performed, which further improves the evolution efficiency of the population in a complex search space. To extract valid information from the shreds and improve the accuracy of the splicing costs, we propose a comprehensive objective function that considers both edge-based and empty-row-based splicing errors. Experiments are carried out on 30 RCCSTD scenarios, and comparisons are made against the previous best-known algorithms. The results show that the proposed SD-MA achieves significantly better solution accuracy and convergence speed.
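To make the idea of a splicing cost that combines edge-based and empty-row-based errors more concrete, here is a minimal sketch in Python. It is not the objective function from the paper, whose exact definition is not given in this abstract. It assumes each shred is a binarized image stored as a list of equal-length rows (1 for ink, 0 for background), that both shreds have the same height, and that the two error terms are combined by a weighted sum. The function names and the weights `w_edge` and `w_row` are hypothetical.

```python
def edge_error(left, right):
    """Count rows where the rightmost pixel of `left` differs from
    the leftmost pixel of `right` (assumed edge-based mismatch)."""
    return sum(1 for l_row, r_row in zip(left, right) if l_row[-1] != r_row[0])


def empty_rows(shred):
    """Return, for each row, whether it contains no ink at all."""
    return [not any(row) for row in shred]


def empty_row_error(left, right):
    """Count rows that are empty in one shred but not in the other
    (assumed empty-row mismatch, e.g. misaligned text lines)."""
    return sum(1 for a, b in zip(empty_rows(left), empty_rows(right)) if a != b)


def splicing_cost(left, right, w_edge=1.0, w_row=1.0):
    """Weighted sum of both error terms for placing `right`
    immediately to the right of `left`. Weights are illustrative."""
    return w_edge * edge_error(left, right) + w_row * empty_row_error(left, right)
```

A lower cost suggests that two shreds are more likely to be horizontal neighbours. In a memetic algorithm, a cost of this kind could be summed over all adjacent pairs in a candidate arrangement to score it.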
