List update is a key step in compression based on the Burrows-Wheeler transform (BWT). Previous work has shown that careful study of the list update step leads to better BWT compression. Surprisingly, the theoretical study of list update algorithms for compression has lagged behind their use in practice. More precisely, the standard model of Sleator and Tarjan for list update assumes a linear cost of access, whereas compression incurs a logarithmic cost of access: accessing item i in the list costs Theta(i) in the standard model but Theta(log i) in compression applications. These two models have been shown, in general, not to be equivalent. This paper makes two contributions. (1) We give the first theoretical proof that the commonly used Move-To-Front (MTF) algorithm performs well under the logarithmic cost-of-access model of compression. This has long been known in practice, but a formal proof under the logarithmic cost model was missing until now. (2) We further refine the online list update model to reflect its use in compression by applying the recently developed 'online algorithms with advice' model. The advice model was initially a purely theoretical construct in which the online algorithm has access to an all-powerful oracle during the computation. We show that, surprisingly, this seemingly unrealistic model can be used to produce better multi-pass compression algorithms. In particular, we introduce an 'almost-online' list update algorithm, which we term BIB, that yields a compression scheme superior to schemes based on standard online algorithms, in particular MTF and TIMESTAMP. For example, for the files in the standard Canterbury Corpus, the compression ratio of the scheme that uses BIB is 33.66 on average, while the compression ratios of the schemes that use MTF and TIMESTAMP are 34.25 and 36.30, respectively.
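To make the difference between the two cost models concrete, the following minimal sketch runs MTF over a short string and totals the access cost under each model. It is an illustration only, not the paper's implementation: the function names are hypothetical, and the Elias gamma code length (2*floor(log2 i) + 1 bits) is used here merely as one assumed instance of a Theta(log i) cost.

```python
def mtf_positions(text, alphabet):
    """Return the 1-based list position of each accessed symbol under MTF."""
    lst = list(alphabet)  # initial list order (assumed given)
    positions = []
    for ch in text:
        i = lst.index(ch) + 1          # 1-based position of the requested item
        positions.append(i)
        lst.insert(0, lst.pop(i - 1))  # move the accessed item to the front
    return positions


def linear_cost(positions):
    """Standard model: accessing position i costs i."""
    return sum(positions)


def log_cost(positions):
    """Compression model (assumed instance): position i costs its
    Elias gamma code length, 2*floor(log2 i) + 1 bits."""
    return sum(2 * i.bit_length() - 1 for i in positions)


if __name__ == "__main__":
    pos = mtf_positions("aabbbc", "abc")
    print(pos, linear_cost(pos), log_cost(pos))
```

On inputs with long runs, as are typical of BWT output, most accesses hit position 1, which keeps the cost low under both models; the models diverge on accesses deep in the list, where the logarithmic cost grows far more slowly than the linear one.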