Optimal In-Place Suffix Sorting

The suffix array is a fundamental data structure for many applications that involve string searching and data compression. We obtain the \emph{first} linear-time in-place suffix array construction algorithm, which is optimal in both time and space for read-only integer alphabets. Our algorithm settles the open problem posed by [Franceschini and Muthukrishnan, ICALP'07], which asked for in-place algorithms running in $o(n\log n)$ time and, ultimately, in $O(n)$ time for integer alphabets with $|\Sigma|\leq n$. Our result is in fact slightly stronger, since we allow $|\Sigma|=O(n)$. We also extend it to obtain an optimal $O(n\log n)$-time in-place suffix sorting algorithm for read-only general alphabets (i.e., only comparisons are allowed).
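For readers unfamiliar with the object being constructed, the following minimal sketch shows what a suffix array is. It is not the algorithm of this paper: it sorts suffixes by direct comparison, materialises each suffix as a temporary string, and is therefore neither linear-time nor in-place. The function name `naive_suffix_array` is hypothetical and chosen only for illustration.

```python
def naive_suffix_array(text):
    """Return the start positions of all suffixes of `text`,
    ordered so that the suffixes are in lexicographic order.

    Illustration only: each comparison may scan O(n) characters and
    each key is a copied substring, so this uses far more than the
    O(1) workspace that an in-place algorithm is allowed.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


if __name__ == "__main__":
    # Suffixes of "banana" in sorted order:
    # a(5), ana(3), anana(1), banana(0), na(4), nana(2)
    print(naive_suffix_array("banana"))  # prints [5, 3, 1, 0, 4, 2]
```

In the read-only setting considered in the abstract, the input text may not be modified and only the output array plus a constant number of extra words are available, which is what rules out a direct approach like the one sketched here.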

[1]  Jeffrey S. Salowe,et al.  Simplified Stable Merging Tasks , 1987, J. Algorithms.

[2]  Sen Zhang,et al.  Optimal Lightweight Construction of Suffix Arrays for Constant Alphabets , 2007, WADS.

[3]  Guy Jacobson,et al.  Space-efficient static trees and graphs , 1989, 30th Annual Symposium on Foundations of Computer Science.

[4]  Hozumi Tanaka,et al.  An efficient method for in memory construction of suffix arrays , 1999, 6th International Symposium on String Processing and Information Retrieval. 5th International Workshop on Groupware.

[5]  Kunihiko Sadakane,et al.  A fast algorithm for making suffix arrays and for Burrows-Wheeler transformation , 1998, Proceedings DCC '98 Data Compression Conference.

[6]  Kunihiko Sadakane,et al.  Faster suffix sorting , 2007, Theoretical Computer Science.

[7]  Dong Kyue Kim,et al.  A Fast Algorithm for Constructing Suffix Arrays for Fixed-Size Alphabets , 2004, WEA.

[8]  Sen Zhang,et al.  Linear Time Suffix Array Construction Using D-Critical Substrings , 2009, CPM.

[9]  Sen Zhang,et al.  Two Efficient Algorithms for Linear Time Suffix Array Construction , 2011, IEEE Transactions on Computers.

[10]  Jens Stoye,et al.  An incomplex algorithm for fast suffix array construction , 2007, ALENEX/ANALCO.

[11]  Jian Li,et al.  Optimal In-Place Suffix Sorting , 2018, SPIRE.

[12]  Peter Sanders,et al.  Simple Linear Work Suffix Array Construction , 2003, ICALP.

[13]  Edward M. McCreight,et al.  A Space-Economical Suffix Tree Construction Algorithm , 1976, JACM.

[14]  Timothy M. Chan,et al.  Selection and Sorting in the "Restore" Model , 2014, SODA.

[15]  Simon J. Puglisi,et al.  An efficient, versatile approach to suffix sorting , 2008, JEAL.

[16]  Enno Ohlebusch,et al.  Replacing suffix trees with enhanced suffix arrays , 2004, J. Discrete Algorithms.

[17]  Juha Kärkkäinen,et al.  Fast Lightweight Suffix Array Construction and Checking , 2003, CPM.

[18]  D. J. Wheeler,et al.  A Block-sorting Lossless Data Compression Algorithm , 1994 .

[19]  Ge Nong,et al.  Practical linear-time O(1)-workspace suffix sorting for constant alphabets , 2013, TOIS.

[20]  Ronald L. Rivest,et al.  Introduction to Algorithms , 1990 .

[21]  David Richard Clark,et al.  Compact pat trees , 1998 .

[22]  Ge Nong,et al.  Linear Suffix Array Construction by Almost Pure Induced-Sorting , 2009, 2009 Data Compression Conference.

[23]  Eugene W. Myers,et al.  Suffix arrays: a new method for on-line string searches , 1993, SODA '90.

[24]  Xin-She Yang,et al.  Introduction to Algorithms , 2021, Nature-Inspired Optimization Algorithms.

[25]  Gianni Franceschini,et al.  In-Place Suffix Sorting , 2007, ICALP.

[26]  William F. Smyth,et al.  A taxonomy of suffix array construction algorithms , 2007, CSUR.

[27]  Pang Ko,et al.  Linear Time Construction of Suffix Arrays , 2002 .

[28]  Roberto Grossi,et al.  Compressed Suffix Arrays and Suffix Trees with Applications to Text Indexing and String Matching , 2005, SIAM J. Comput..

[29]  M. Farach Optimal suffix tree construction with large alphabets , 1997, Proceedings 38th Annual Symposium on Foundations of Computer Science.

[30]  Giovanni Manzini,et al.  Opportunistic data structures with applications , 2000, Proceedings 41st Annual Symposium on Foundations of Computer Science.

[31]  Abraham Lempel,et al.  Compression of individual sequences via variable-rate coding , 1978, IEEE Trans. Inf. Theory.

[32]  Peter Sanders,et al.  Linear work suffix array construction , 2006, JACM.

[33]  Giovanni Manzini,et al.  Engineering a Lightweight Suffix Array Construction Algorithm , 2002, ESA.

[34]  Simon J. Puglisi,et al.  Trends in Suffix Sorting: A Survey of Low Memory Algorithms , 2012, ACSC.

[35]  Wing-Kai Hon,et al.  Breaking a Time-and-Space Barrier in Constructing Full-Text Indices , 2009, SIAM J. Comput..

[36]  Srinivas Aluru,et al.  Space efficient linear time construction of suffix arrays , 2005, J. Discrete Algorithms.

[37]  Travis Gagie,et al.  Lightweight Data Indexing and Compression in External Memory , 2009, Algorithmica.

[38]  Enno Ohlebusch,et al.  The Enhanced Suffix Array and Its Applications to Genome Analysis , 2002, WABI.

[39]  Yi Wu,et al.  Induced Sorting Suffixes in External Memory , 2015, TOIS.

[40]  Simon J. Puglisi,et al.  Faster Lightweight Suffix Array Construction , .

[41]  Yoram Bresler,et al.  Antisequential suffix sorting for BWT-based data compression , 2005, IEEE Transactions on Computers.