On Indeterminate Strings Matching

Given two equal-length indeterminate strings p and t, each specified by a set of characters per position, we obtain a determinate string p_w from p and a determinate string t_w from t by choosing one character per position. We say that p and t match when p_w and t_w match for some choice of characters. While the most standard notion of a match for determinate strings is that they are identical, in certain applications other definitions are more appropriate, the prime examples being parameterized matching, order-preserving matching, and the recently introduced Cartesian tree matching. We provide a systematic study of the complexity of string matching for equal-length indeterminate strings under these different notions of matching. We use n to denote the length of both strings and r to denote an upper bound on the number of uncertain characters per position. First, we give the first polynomial-time algorithm for the Cartesian tree version; for constant r, it runs in deterministic 𝒪(n log² n) time and expected 𝒪(n log n log log n) time, using 𝒪(n log n) space. Second, we establish NP-hardness of the order-preserving version for r=2, thus resolving a question explicitly stated by Henriques et al. [CPM 2018], who showed hardness for r=3. Third, we establish NP-hardness of the parameterized version for r=2. As both parameterized and order-preserving indeterminate matching reduce to standard determinate matching for r=1, this provides a complete classification for these three variants.
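To make the definitions concrete, the following is a minimal brute-force sketch of indeterminate matching under the three notions named above. It is not the algorithm of the paper: it enumerates every choice of characters and therefore takes exponential time. All function names are hypothetical, characters are assumed to be integers, and the parameterized variant shown treats every character as a parameter (no fixed characters). The Cartesian tree test assumes the usual characterization via nearest smaller-or-equal positions to the left (the parent-distance representation).

```python
from itertools import product

def parent_distance(s):
    """Distance from each position to the nearest earlier position
    holding a value <= the current one (0 if there is none)."""
    stack, pd = [], []
    for i, v in enumerate(s):
        while stack and s[stack[-1]] > v:
            stack.pop()
        pd.append(i - stack[-1] if stack else 0)
        stack.append(i)
    return pd

def cartesian_tree_match(x, y):
    # Assumption: equal parent-distance arrays <=> same Cartesian tree shape.
    return parent_distance(x) == parent_distance(y)

def order_preserving_match(x, y):
    # The relative order of every pair of positions must agree.
    n = len(x)
    return all((x[i] < x[j]) == (y[i] < y[j])
               for i in range(n) for j in range(n))

def parameterized_match(x, y):
    # A bijection between the characters used must map x onto y.
    fwd, bwd = {}, {}
    for a, b in zip(x, y):
        if fwd.setdefault(a, b) != b or bwd.setdefault(b, a) != a:
            return False
    return True

def indeterminate_match(p, t, match):
    """p, t: equal-length lists of character sets, one set per position.
    True if some choice p_w, t_w of one character per position matches."""
    if len(p) != len(t):
        raise ValueError("strings must have equal length")
    return any(match(pw, tw)
               for pw in product(*p)
               for tw in product(*t))

# Example with r = 2: the middle position of p is uncertain.
p = [{1}, {0, 3}, {2}]
t = [{5}, {4}, {6}]
print(indeterminate_match(p, t, order_preserving_match))
print(indeterminate_match(p, t, cartesian_tree_match))
print(indeterminate_match(p, t, parameterized_match))
```

In this example the choice p_w = (1, 0, 2) has the same relative order as t_w = (5, 4, 6), so the order-preserving and Cartesian tree checks succeed, and since all characters are distinct a bijection exists as well. The determinate pair (1, 1) and (2, 3) shows that the notions differ: it matches under the Cartesian tree test but under neither of the other two.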

[1]  Philip Bille,et al.  String matching with variable length gaps , 2012, Theor. Comput. Sci..

[2]  Rudolf Fleischer,et al.  Order Preserving Matching , 2013, Theor. Comput. Sci..

[3]  Solon P. Pissis,et al.  Even Faster Elastic-Degenerate String Matching via Fast Matrix Multiplication , 2019, ICALP.

[4]  Shu Wang,et al.  Indeterminate strings, prefix arrays & undirected graphs , 2014, Theor. Comput. Sci..

[5]  Gad M. Landau,et al.  Cartesian Tree Matching and Indexing , 2019, CPM.

[6]  Wojciech Rytter,et al.  Covering problems for partial words and for indeterminate strings , 2017, Theor. Comput. Sci..

[7]  Luís M. S. Russo,et al.  Order-Preserving Pattern Matching Indeterminate Strings , 2018, CPM.

[8]  S. Muthukrishnan,et al.  Alphabet Dependence in Parameterized Matching , 1994, Inf. Process. Lett..

[9]  Pawel Gawrychowski,et al.  Order-Preserving Pattern Matching with k Mismatches , 2014, CPM.

[10]  Wojciech Rytter,et al.  A linear time algorithm for consecutive permutation pattern matching , 2013, Inf. Process. Lett..

[11]  Mohammad Sohel Rahman,et al.  Enhanced Covers of Regular and Indeterminate Strings Using Prefix Tables , 2015, J. Autom. Lang. Comb..

[12]  Kunsoo Park,et al.  Fast Multiple Pattern Cartesian Tree Matching , 2019, WALCOM.

[13]  Joong Chae Na,et al.  A fast algorithm for order-preserving pattern matching , 2015, Inf. Process. Lett..

[14]  Costas S. Iliopoulos,et al.  Efficient Pattern Matching in Elastic-Degenerate Strings , 2016, Inf. Comput..

[15]  Wojciech Rytter,et al.  String Periods in the Order-Preserving Model , 2018, STACS.

[16]  Gad M. Landau,et al.  Finding Periods in Cartesian Tree Matching , 2019, IWOCA.

[17]  Moshe Lewenstein,et al.  Parameterized matching with mismatches , 2007, J. Discrete Algorithms.

[18]  Michael Soltys,et al.  An improved upper bound and algorithm for clique covers , 2018, J. Discrete Algorithms.

[19]  Brenda S. Baker,et al.  A theory of parameterized pattern matching: algorithms and applications , 1993, STOC.

[20]  Wojciech Rytter,et al.  Covering Problems for Partial Words and for Indeterminate Strings , 2014, ISAAC.

[21]  Dan E. Willard,et al.  Log-logarithmic worst-case range queries are possible in space Θ(N) , 1983 .

[22]  William F. Smyth,et al.  Algorithms on indeterminate strings , 2003 .

[23]  Michael J. Fischer,et al.  The String-to-String Correction Problem , 1974, JACM.

[24]  Kunsoo Park,et al.  Fast Cartesian Tree Matching , 2019, SPIRE.

[25]  Shu Wang,et al.  Fast pattern-matching on indeterminate strings , 2008, J. Discrete Algorithms.

[26]  Zsuzsanna Lipták,et al.  Algorithms for Jumbled Pattern Matching in Strings , 2011, Int. J. Found. Comput. Sci..

[27]  Arnaud Lefebvre,et al.  Efficient pattern matching in degenerate strings with the Burrows-Wheeler transform , 2019, Inf. Process. Lett..

[28]  Brenda S. Baker.  Parameterized Pattern Matching: Algorithms and Applications , 1996, J. Comput. Syst. Sci..

[29]  B. Zeidman.  Software v. Software , 2010, IEEE Spectrum.

[30]  Jean Vuillemin,et al.  A unifying look at data structures , 1980, CACM.

[31]  Costas S. Iliopoulos,et al.  Pattern Matching Algorithms with Don't Cares , 2007, SOFSEM.

[32]  Fouad B. Chedid.  On Pattern Matching With Swaps , 2013, 2013 ACS International Conference on Computer Systems and Applications (AICCSA).

[33]  Moshe Lewenstein,et al.  Approximate parameterized matching , 2007, TALG.

[34]  William F. Smyth,et al.  Constructing an indeterminate string from its associated graph , 2018, Theor. Comput. Sci..