String Matching Under a General Matching Relation

In standard string matching, each symbol matches only itself, In other string matching problems, e.g., the string matching with "don?t-cares" problem, a symbol may match several symbols. In general, an arbitrary many-to-many matching relation might hold between symbols. We consider a general string matching problem in which such a matching relation is specified and those positions in a text t, of length n, are sought at which the pattern p, of length m, matches under this relation. Depending upon the existence of a simple and easily recognizable property in the given matching relation, we show that string matching either requires linear (i.e., O(n + m)) time or is at least as hard as boolean convolution. As an application, we show that the matching relations of several independently studied string matching problems do indeed fall into the latter (hard) category. We also give a generic string matching algorithm that works far any matching relation and has complexity o(nm) except for very "large" matching relations.