Parameterized pattern matching by Boyer-Moore-type algorithms

This paper investigates generalizations of the Boyer-Moore string pattern matching algorithm to parameterized pattern matching. Parameterized pattern matching was invented by Baker [Bak93b] for the purpose of finding sections of code in a software system that are the same except for a systematic change of parameters. We show that for Boyer-Moore-type algorithms that do not save information about previously matched portions of text, straightforward generalizations to parameterized pattern matching must have a running time of Ω(n min(m, p)), where n is the text length, m is the pattern length, and p is the number of distinct parameter symbols in the alphabet. However, we describe a parameterized pattern matching algorithm, PturboBM, that has the same overall structure as the Boyer-Moore algorithm but saves information about previously matched portions of text and runs in time O(n log min(m, p)), with preprocessing time O(m log min(m, p)). Experiments show that the PturboBM algorithm has promise as an efficient means of accomplishing parameterized pattern matching for applications.
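To make the notion of a parameterized match concrete, the following minimal sketch checks every text window against the pattern by brute force. It is not PturboBM and not a Boyer-Moore-type algorithm; it is a naive O(nm) baseline meant only to illustrate what "the same except for a systematic change of parameters" means. It assumes a prev-style encoding, which this abstract does not describe: each parameter symbol is replaced by the distance back to its previous occurrence in the window, or 0 at a first occurrence, so that two strings match exactly when their encodings are equal. The function names `prev_encode` and `naive_pmatch` and the sample strings are hypothetical.

```python
def prev_encode(s, params):
    """Encode s so that consistently renamed strings get equal encodings.

    Each parameter symbol becomes the distance to its previous occurrence
    in s (0 if it has not occurred yet); all other symbols are kept as-is.
    Assumes s is a str, so kept characters never compare equal to distances.
    """
    last = {}  # parameter symbol -> index of its most recent occurrence
    out = []
    for i, c in enumerate(s):
        if c in params:
            out.append(i - last[c] if c in last else 0)
            last[c] = i
        else:
            out.append(c)
    return out


def naive_pmatch(text, pattern, params):
    """Return all start positions where pattern parameterized-matches text.

    Brute force: every window of length m is re-encoded from scratch,
    so this takes O(n * m) time. Illustrative baseline only.
    """
    m = len(pattern)
    target = prev_encode(pattern, params)
    return [i for i in range(len(text) - m + 1)
            if prev_encode(text[i:i + m], params) == target]


if __name__ == "__main__":
    params = set("abcdefxy")  # assumed parameter symbols (identifiers)
    print(naive_pmatch("a=b+a;c=d+c;e=e+f;", "x=y+x;", params))
```

With these inputs the pattern `x=y+x;` matches the windows `a=b+a;` and `c=d+c;`, which differ from it only by a consistent renaming, but not `e=e+f;`, where the repeated symbol falls in a different position. Each window is encoded independently here; avoiding that repeated work is the kind of cost the bounds above are concerned with.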