An ACL2 Proof of the Correctness of the Preprocessing for a Variant of the Boyer-Moore Fast String Searching Algorithm

We describe a mechanically checked proof that a straightforward implementation of the preprocessing for a version of the Boyer-Moore Fast String Searching Algorithm is correct. We say “straightforward” because the implementation does not attempt to do the preprocessing quickly (unlike production implementations). We say “a version” of the algorithm because the algorithm verified is not the one in the classic paper but one that uses a single 2-dimensional skip table. The proof is done at the “JVM bytecode” level. We implement a formal model of a subset of the JVM, M3, in ACL2, generate the bytecode for the preprocessing algorithm in M3, formally specify the methods and verify the effects of executing those methods. The top-level theorem proves that as a result of calling the preprocessing algorithm on a given string pattern we get a correctly set 2-dimensional array containing the skip information for the fast string searching algorithm. The proof includes verifying four methods, three singly-nested loops and a doubly-nested loop. Because 2-dimensional arrays are represented as 1-dimensional arrays of references to 1-dimensional arrays, our proof involves pointer manipulation.
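
To make the idea of a single 2-dimensional skip table concrete, the following Python sketch shows one deliberately naive way such a table could be filled in and used. It is an illustration under stated assumptions, not the verified M3 bytecode or its ACL2 specification: the names shift_for, preprocess and search are hypothetical, characters are assumed to have codes below 256, and each entry holds the distance to slide the pattern, which may differ from the exact skip quantity stored by the verified algorithm. The table is a list of row lists, so the outer list holds references to the rows, loosely mirroring the array-of-references layout mentioned above.

```python
def shift_for(pat, j, c):
    """Smallest slide s >= 1 consistent with what has been read.

    Assumed situation: pat[j+1:] has matched the text and the text
    character aligned with pat[j] is c.
    """
    n = len(pat)
    seen = c + pat[j + 1:]  # text known to lie under pat[j:]
    for s in range(1, n):
        # After sliding by s, pat[k - s] sits where pat[k] used to be.
        # Compare only where the slid pattern overlaps the known text.
        if all(pat[k - s] == seen[k - j] for k in range(max(j, s), n)):
            return s
    return n  # no overlap is consistent: slide past everything read


def preprocess(pat, alphabet_size=256):
    """Build the table: one row per pattern index, one column per character code."""
    return [[shift_for(pat, j, chr(c)) for c in range(alphabet_size)]
            for j in range(len(pat))]


def search(pat, txt, table):
    """Return the index of the first occurrence of pat in txt, or -1."""
    n = len(pat)
    if n == 0:
        return 0
    i = 0  # index in txt of the left end of the current alignment
    while i + n <= len(txt):
        j = n - 1
        while j >= 0 and pat[j] == txt[i + j]:  # compare right to left
            j -= 1
        if j < 0:
            return i
        i += table[j][ord(txt[i + j])]  # slide by the precomputed amount
    return -1
```

A call such as search("abc", "xxabcxx", preprocess("abc")) looks up one table entry per mismatch. The preprocessing above makes no attempt to be fast, in the same spirit as the "straightforward" implementation described in the abstract.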