Using Correctness-by-Construction to Derive Dead-zone Algorithms

We present a stepwise, refinement-oriented derivation of a family of algorithms for single keyword pattern matching. All are based on the so-called dead-zone style, in which parts of the input text are tracked as either unprocessed (‘live’) or processed (‘dead’). Such algorithms allow Boyer-Moore-style shifting in the input in two directions (left and right) instead of one, and have shown promising results in practice. They are all the more interesting because of their potential for concurrency (multithreading). Our presentation of the algorithm family focuses on the correctness arguments (proofs) accompanying each step, and on the resulting elegance and efficiency. Several new algorithms are described as part of this family, including ones amenable to concurrent execution.
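
The live/dead bookkeeping described above can be illustrated with a minimal sketch. It is not one of the derived algorithms from the paper: the function name `dead_zone_search`, the explicit work list `live`, and the Horspool-like shift tables are assumptions chosen for illustration. Candidate match positions start out live; each probe at the midpoint of a live range kills the probed position, plus a stretch to its left and to its right determined by the first and last characters of the probed window.

```python
def dead_zone_search(text, pattern):
    """Return all start positions of pattern in text (illustrative sketch)."""
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return []

    # right_shift[c]: smallest k >= 1 such that a match starting k positions
    # to the right is compatible with c being the last character of the
    # current window (a Horspool-like rule; an assumption of this sketch).
    right_shift = {}
    for k in range(1, m):
        right_shift.setdefault(pattern[m - 1 - k], k)

    # left_shift[c]: mirror image, using the first character of the window.
    left_shift = {}
    for k in range(1, m):
        left_shift.setdefault(pattern[k], k)

    matches = []
    # Half-open ranges [lo, hi) of still-live candidate start positions.
    live = [(0, n - m + 1)]
    while live:
        lo, hi = live.pop()
        if lo >= hi:
            continue  # nothing live in this range
        mid = (lo + hi) // 2
        if text[mid:mid + m] == pattern:
            matches.append(mid)
        shl = left_shift.get(text[mid], m)
        shr = right_shift.get(text[mid + m - 1], m)
        # Positions mid-shl+1 .. mid+shr-1 are now dead; the rest stays live.
        live.append((lo, mid - shl + 1))
        live.append((mid + shr, hi))
    return sorted(matches)


print(dead_zone_search("abracadabra", "abra"))
```

The two ranges pushed after each probe do not overlap and do not depend on each other, which is the property that makes this style a candidate for multithreaded execution; the sketch itself processes them sequentially.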
