A compact index for order‐preserving pattern matching

Order‐preserving pattern matching was introduced only recently, but it has already attracted considerable attention. Given a reference sequence and a pattern, we want to locate all substrings of the reference sequence whose elements have the same relative order as the pattern elements. We consider the offline version of this problem, in which we build an index for the reference sequence so that subsequent searches can be answered efficiently. We propose a space‐efficient index that works well in practice despite lacking good worst‐case time bounds. Our solution is based on a new approach: decomposing the indexed sequence into an order component, which contains the ordering information, and a δ component, which contains information on the absolute values. Experiments show that this approach is viable, that it is faster than the available alternatives, and that it is the first to offer small space usage and fast retrieval simultaneously.
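To make the matching criterion concrete, the following minimal Python sketch implements a naive, unindexed order‐preserving matcher. It is not the index proposed here and says nothing about the order/δ decomposition; it only illustrates what counts as a match. The names `rank_signature` and `naive_op_match` are hypothetical, and the sketch assumes that the values inside each window (and inside the pattern) are distinct.

```python
from typing import List, Sequence


def rank_signature(window: Sequence[int]) -> List[int]:
    """Return, for each position, the rank of its value within the window.

    Rank 0 is the smallest value. Assumption: values are distinct;
    ties are not handled in any meaningful way by this sketch.
    """
    order = sorted(range(len(window)), key=lambda i: window[i])
    ranks = [0] * len(window)
    for rank, idx in enumerate(order):
        ranks[idx] = rank
    return ranks


def naive_op_match(text: Sequence[int], pattern: Sequence[int]) -> List[int]:
    """Return every start position where a window of the text has the same
    relative order as the pattern.

    This is a brute-force scan that re-sorts every window; it is meant
    only to illustrate the definition, not to be efficient.
    """
    m = len(pattern)
    if m == 0:
        return []
    target = rank_signature(pattern)
    return [
        i
        for i in range(len(text) - m + 1)
        if rank_signature(text[i:i + m]) == target
    ]


if __name__ == "__main__":
    text = [10, 22, 15, 30, 20, 18, 27]
    pattern = [1, 3, 2]  # shape: low, high, middle
    print(naive_op_match(text, pattern))  # [0, 2]
```

In this example the windows `[10, 22, 15]` and `[15, 30, 20]` match the pattern `[1, 3, 2]` even though their absolute values differ, because each follows the same low, high, middle shape.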
