k-Abelian Pattern Matching: Revisited, Corrected, and Extended

Two strings of equal length are called k-Abelian equivalent if they share the same multiset of factors of length at most k. Ehlers et al. [JDA, 2015] considered the k-Abelian pattern matching problem, where the task is to find all factors of a text T that are k-Abelian equivalent to a pattern P, and claimed a number of algorithmic results for its offline and online versions. In this paper, we first argue that some of these claimed results contain major errors. We then present a new algorithm that correctly solves the offline version of the problem within the bounds claimed by Ehlers et al., namely O(n+m) time and O(m) space, where n = |T| and m = |P|. We also show how to correct the errors in their online algorithm, as well as the errors in their real-time algorithms for a slightly different problem called the extended k-Abelian pattern matching problem.
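
To make the definition concrete, the following is a minimal brute-force sketch of k-Abelian equivalence and of the pattern matching problem as stated above. It is not the O(n+m)-time algorithm of this paper, nor the algorithm of Ehlers et al.; it recomputes the factor multiset of every window and is meant only to pin down the problem statement. The function names (factor_multiset, k_abelian_equivalent, naive_k_abelian_matches) are illustrative assumptions, not taken from the paper.

```python
from collections import Counter


def factor_multiset(s, k):
    """Return the multiset of all factors (substrings) of s of length 1..k."""
    counts = Counter()
    for length in range(1, k + 1):
        for i in range(len(s) - length + 1):
            counts[s[i:i + length]] += 1
    return counts


def k_abelian_equivalent(u, v, k):
    """True if u and v have equal length and the same factor multiset up to length k."""
    return len(u) == len(v) and factor_multiset(u, k) == factor_multiset(v, k)


def naive_k_abelian_matches(text, pattern, k):
    """Start positions (0-based) of factors of text k-Abelian equivalent to pattern.

    Brute force: every window's multiset is rebuilt from scratch,
    so this is far slower than the linear-time bound discussed above.
    """
    m = len(pattern)
    target = factor_multiset(pattern, k)
    return [
        i
        for i in range(len(text) - m + 1)
        if factor_multiset(text[i:i + m], k) == target
    ]
```

For instance, "aabab" and "abaab" are 2-Abelian equivalent (both contain aa once, ab twice, and ba once, with three a's and two b's) but not 3-Abelian equivalent, since only the first contains the factor bab.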

[1]  Juhani Karhumäki,et al.  Regularity of k-Abelian Equivalence Classes of Fixed Cardinality , 2018, Adventures Between Lower Bounds and Higher Altitudes.

[2]  Juhani Karhumäki,et al.  On a generalization of Abelian equivalence and complexity of infinite words , 2013, J. Comb. Theory, Ser. A.

[3]  Esko Ukkonen,et al.  On-line construction of suffix trees , 1995, Algorithmica.

[4]  Ely Porat,et al.  On the relationship between histogram indexing and block-mass indexing , 2014, Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences.

[5]  Michael A. Bender,et al.  The Level Ancestor Problem Simplified , 2002, LATIN.

[6]  Rasmus Pagh,et al.  Cuckoo Hashing , 2001, Encyclopedia of Algorithms.

[7]  Dong Kyue Kim,et al.  Constructing suffix arrays in linear time , 2005, J. Discrete Algorithms.

[8]  Peter Sanders,et al.  Linear work suffix array construction , 2006, JACM.

[9]  Sen Zhang,et al.  Two Efficient Algorithms for Linear Time Suffix Array Construction , 2011, IEEE Transactions on Computers.

[10]  Dan E. Willard,et al.  Log-logarithmic worst-case range queries are possible in space Θ(N) , 1983 .

[11]  Srinivas Aluru,et al.  Space efficient linear time construction of suffix arrays , 2003, J. Discrete Algorithms.

[12]  Moshe Lewenstein,et al.  Weighted Ancestors in Suffix Trees , 2014, ESA.

[13]  S. Muthukrishnan,et al.  On the sorting-complexity of suffix tree construction , 2000, JACM.

[14]  Florin Manea,et al.  k-Abelian pattern matching , 2014, J. Discrete Algorithms.

[15]  Juhani Karhumäki,et al.  Fine and Wilf's Theorem for k-Abelian Periods , 2012, Int. J. Found. Comput. Sci..

[16]  Mohammad Sohel Rahman,et al.  Indexing permutations for binary strings , 2010, Inf. Process. Lett..

[17]  Gad M. Landau,et al.  Binary Jumbled Pattern Matching via All-Pairs Shortest Paths , 2014, ArXiv.

[18]  Faith Ellen,et al.  Optimal Bounds for the Predecessor Problem and Related Problems , 2002, J. Comput. Syst. Sci..

[19]  Zsuzsanna Lipták,et al.  Searching for Jumbled Patterns in Strings , 2009, Stringology.

[20]  Uwe Baier.  Linear-time Suffix Sorting - A New Approach for Suffix Array Construction , 2016, CPM.

[21]  Mohammad Sohel Rahman,et al.  Sub-quadratic time and linear space data structures for permutation matching in binary strings , 2012, J. Discrete Algorithms.

[22]  Zsuzsanna Lipták,et al.  Algorithms for Jumbled Pattern Matching in Strings , 2011, Int. J. Found. Comput. Sci..

[23]  Juhani Karhumäki,et al.  On the Unavoidability of k-Abelian Squares in Pure Morphic Words , 2013 .

[24]  Moshe Lewenstein,et al.  Clustered Integer 3SUM via Additive Combinatorics , 2015, STOC.

[25]  Moshe Lewenstein,et al.  Dynamic weighted ancestors , 2007, SODA '07.

[26]  Gad M. Landau,et al.  Scaled and permuted string matching , 2004, Inf. Process. Lett..

[27]  Hideo Bannai,et al.  Computing Abelian String Regularities Based on RLE , 2017, IWOCA.

[28]  Moshe Lewenstein,et al.  On Hardness of Jumbled Indexing , 2014, ICALP.

[29]  Wojciech Rytter,et al.  Efficient Indexes for Jumbled Pattern Matching with Constant-Sized Alphabet , 2013, ESA.

[30]  A. B. Cook.  Some unsolved problems , 1952, Hospital management.

[31]  Michael A. Bender,et al.  The LCA Problem Revisited , 2000, LATIN.

[32]  Juhani Karhumäki,et al.  Local Squares, Periodicity and Finite Automata , 2011, Rainbow of Computer Science.

[33]  Peter van Emde Boas,et al.  Design and implementation of an efficient priority queue , 1976, Mathematical systems theory.

[34]  Aleksi Saarela,et al.  Strongly k-Abelian Repetitions , 2013, WORDS.

[35]  Juhani Karhumäki,et al.  On cardinalities of k-abelian equivalence classes , 2016, Theor. Comput. Sci..