Simpler FM-index for parameterized string matching

Abstract. In parameterized string matching, a string consists of static symbols and parameterized symbols. Two strings match if there exists a one-to-one mapping from the set of parameterized symbols onto itself that transforms one string into the other. Traditionally, a full-text index for this problem is built on suffixes transformed by the so-called previous occurrence encoding. Although a space-efficient data structure based on backward search has been proposed recently, it is specialized and complex owing to the nature of the encoding scheme. In this study, we show that a slight modification of the encoding scheme, namely using ∞ instead of 0 to denote the first occurrence of each parameterized symbol, enables a much simpler FM-index for this problem, comprising only two wavelet trees and one range maximum query index.
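To illustrate the encoding, the following is a minimal Python sketch. It assumes the usual definition of previous occurrence encoding, in which each parameterized symbol is replaced by the distance to its previous occurrence and static symbols are left unchanged. The names `prev_encode` and `p_match`, and the `first` argument that switches between the 0 and ∞ conventions, are hypothetical and chosen for illustration only. The sketch shows the encoding itself, not the index.

```python
import math

def prev_encode(s, params, first=math.inf):
    """Previous occurrence encoding of s (illustrative sketch).

    params: set of parameterized symbols; all other symbols are static.
    first:  marker for the first occurrence of a parameterized symbol
            (math.inf for the modified scheme, 0 for the traditional one).
    """
    last = {}  # most recent position of each parameterized symbol
    out = []
    for i, c in enumerate(s):
        if c in params:
            # Distance to the previous occurrence, or the marker if none.
            out.append(i - last[c] if c in last else first)
            last[c] = i
        else:
            out.append(c)  # static symbols are kept as they are
    return out

def p_match(x, y, params):
    """Two strings match in the parameterized sense iff their encodings agree."""
    return len(x) == len(y) and prev_encode(x, params) == prev_encode(y, params)

params = {"x", "y"}
print(prev_encode("xayxy", params))           # [inf, 'a', inf, 3, 2]
print(prev_encode("xayxy", params, first=0))  # [0, 'a', 0, 3, 2]
print(p_match("xayxy", "yaxyx", params))      # True: swap x and y
print(p_match("xayxy", "xaxxy", params))      # False
```

Both conventions decide matches identically, since only the marker for a first occurrence changes.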