Indexing the Bijective BWT

The Burrows-Wheeler transform (BWT) is a permutation of a text that is widely used in data compression and text indexing. The bijective BWT is a bijective variant of the BWT that has not yet been studied for text indexing applications. We fill this gap by proposing a self-index built on the bijective BWT. The self-index applies the backward search technique of the FM-index to find a pattern P with O(|P| lg |P|) backward search steps.
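As a rough illustration of the transform itself (not of the proposed self-index), the following sketch computes the bijective BWT from its standard definition: factorize the text into Lyndon words, collect every rotation of every factor, sort the rotations by comparing their infinite repetitions, and output the last character of each. The function names are hypothetical, and the quadratic-time sorting is chosen for readability, not efficiency.

```python
from functools import cmp_to_key


def lyndon_factorization(text):
    """Split text into a non-increasing sequence of Lyndon words (Duval's algorithm)."""
    factors = []
    i, n = 0, len(text)
    while i < n:
        j, k = i + 1, i
        # Extend the current run while it remains a prefix of a power of a Lyndon word.
        while j < n and text[k] <= text[j]:
            k = i if text[k] < text[j] else k + 1
            j += 1
        # Emit every full copy of the Lyndon word of length j - k.
        while i <= k:
            factors.append(text[i:i + j - k])
            i += j - k
    return factors


def omega_compare(u, v):
    """Compare the infinite repetitions u u u ... and v v v ... lexicographically."""
    # Comparing the first len(u) + len(v) characters is enough to decide the order.
    for t in range(len(u) + len(v)):
        a, b = u[t % len(u)], v[t % len(v)]
        if a != b:
            return -1 if a < b else 1
    return 0


def bijective_bwt(text):
    """Naive bijective BWT: last characters of the sorted rotations of all Lyndon factors."""
    rotations = [
        factor[s:] + factor[:s]
        for factor in lyndon_factorization(text)
        for s in range(len(factor))
    ]
    rotations.sort(key=cmp_to_key(omega_compare))
    return "".join(rotation[-1] for rotation in rotations)


print(lyndon_factorization("banana"))  # ['b', 'an', 'an', 'a']
print(bijective_bwt("banana"))         # annbaa
```

Unlike the classical BWT, this output needs no end-of-text marker or stored index to be inverted, which is what makes the transform bijective.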
