String Range Matching

Given strings X and Y the exact string matching problem is to find the occurrences of Y as a substring of X. An alternative formulation asks for the lexicographically consecutive set of suffixes of X that begin with Y. We introduce a generalization called string range matching where we want to find the suffixes of X that are in an arbitrary lexicographical range bounded by two strings Y and Z. The problem has applications in distributed suffix sorting, where Y and Z are themselves suffixes of X. Exact string matching can be solved in linear time and constant extra space under the standard comparison model. Our conjecture is that string range matching is a harder problem and cannot be solved within the same time–space complexity. In this paper, we trace the upper bound on the complexity of string range matching by describing algorithms that are within a logarithmic factor of the time–space complexity of exact string matching, as well as variants of the problem and the model that can be solved in linear time and constant extra space.