TREE COMPRESSION AND OPTIMIZATION WITH APPLICATIONS

Dedicated to the memory of Markku Tamminen (1945-1989)

abstractly. Given a tree, the task is to map it as compactly as possible to memory, which is seen as a string of bits, a set of memory locations (words), or a set of memory blocks (pages). The range of the mapping depends on the application in question.

In traditional tree compression the only operations performed are encoding a tree to a bit string and decoding the bit string back to a tree. The benefits of compressing large to very large trees are clear: compression reduces storage and data transfer requirements. On the other hand, tree compression has some severe disadvantages. Above all, it makes all the normal tree operations (children, parent, search, delete, insert, etc.) more expensive. In most compression methods the only way to perform these operations is to decode the compressed tree, carry out the operation, and encode the tree again.

In the tree optimization problem, a term adopted from the work of Jacobson [4], the task is to maintain the functionality of a tree in its compressed form. That is, we want to perform some tree operations as efficiently as in the uncompressed case (where an operation is a simple matter of pointer manipulation).

We are mainly concerned with applications where the trees are manipulated in the internal memory of a computer, so the mappings are to bit strings or memory words only. For applications concerning external memories, see for example [5,6]. The compression and optimization of trees is usually performed in two phases: the compression of the structure (
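The encode/decode round trip described above for traditional tree compression can be illustrated with a minimal sketch. It assumes a common textbook scheme for the structure of a binary tree (a preorder walk that writes 1 for a node and 0 for an empty subtree), which is not necessarily the coding used in this paper. The names `Node`, `encode`, and `decode` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Node:
    """Unlabeled binary tree node; None stands for an empty subtree."""
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def encode(tree: Optional[Node]) -> str:
    """Preorder walk: '1' for a node, '0' for an empty subtree."""
    if tree is None:
        return "0"
    return "1" + encode(tree.left) + encode(tree.right)


def decode(bits: str) -> Optional[Node]:
    """Rebuild the tree from a bit string produced by encode()."""
    def parse(pos: int) -> Tuple[Optional[Node], int]:
        # Returns the subtree starting at pos and the next unread position.
        if bits[pos] == "0":
            return None, pos + 1
        left, pos = parse(pos + 1)
        right, pos = parse(pos)
        return Node(left, right), pos

    tree, end = parse(0)
    if end != len(bits):
        raise ValueError("trailing bits after a complete tree")
    return tree


# A 4-node tree: the root has a leaf on the left and, on the right,
# a node whose only child is a right leaf.
tree = Node(Node(), Node(None, Node()))
bits = encode(tree)          # "110010100", i.e. 2n + 1 bits for n = 4 nodes
restored = decode(bits)      # equal to the original tree
```

In this sketch the bit string supports nothing but the round trip: finding the parent or children of a node means decoding first, which is the drawback that the tree optimization problem sets out to avoid.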

[1] Kurt Mehlhorn et al. Data Structures and Algorithms 1: Sorting and Searching, 2011, EATCS Monographs on Theoretical Computer Science.

[2] Jeffrey S. Salowe et al. Simplified Stable Merging Tasks, 1987, J. Algorithms.

[3] Ian H. Witten et al. Arithmetic coding for data compression, 1987, CACM.

[4] Guy Joseph Jacobson et al. Succinct static data structures, 1988.

[5] Jose Felipe Contla. Compact coding of syntactically correct source programs, 1985, Softw. Pract. Exp.

[6] Ellis Horowitz et al. Algorithms for trie compaction, 1984, TODS.

[7] Eiji Kawaguchi et al. On a Method of Binary-Picture Representation and Its Application to Data Compression, 1980, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[8] Erkki Mäkinen. A linear time and space algorithm for finding isomorphic subtrees of a binary tree, 1991.

[9] Adil M.M. Al-Hussaini. File compression using probabilistic grammars and LR parsing, 1982.

[10] Erkki Mäkinen et al. A note on the complexity of trie compaction, 1990, Bull. EATCS.

[11] Douglas Comer et al. Analysis of a heuristic for full trie minimization, 1981, TODS.

[12] Robert D. Cameron. Source encoding using syntactic information source models, 1988, IEEE Trans. Inf. Theory.

[13] Erkki Mäkinen et al. A Survey on Binary Tree Codings, 1991, Comput. J.

[14] Markku Tamminen et al. Encoding pixel trees, 1984, Comput. Vis. Graph. Image Process.

[15] Douglas Comer et al. The difficulty of optimum index selection, 1978, TODS.

[16] Leonardo Felician et al. P-Compressed Quadtrees for Image Storing, 1988, Comput. J.

[17] Daniel S. Hirschberg et al. Data compression, 1987, CSUR.

[18] J. Ian Munro et al. Searchability in merging and implicit data structures, 1987, BIT Comput. Sci. Sect.

[19] György Turán et al. On the succinct representation of graphs, 1984, Discret. Appl. Math.

[20] James A. Storer et al. Data Compression: Methods and Theory, 1987.

[21] Sandra E. Hutchins. Data Compression in Context-Free Languages, 1971, IFIP Congress.

[22] Greg N. Frederickson et al. Implicit Data Structures for the Dictionary Problem, 1983, JACM.

[23] Gaston H. Gonnet et al. Handbook Of Algorithms And Data Structures, 1984.

[24] Douglas Comer et al. Heuristics for trie index minimization, 1979, ACM Trans. Database Syst.

[25] David Zerling et al. Generating binary trees using rotations, 1985, JACM.

[26] Douglas Comer et al. The Complexity of Trie Index Construction, 1977, JACM.

[27] Jorma Rissanen et al. Compression of Black-White Images with Arithmetic Coding, 1981, IEEE Trans. Commun.

[28] Jukka Teuhola et al. Text compression using prediction, 1986, SIGIR '86.

[29] J. Ian Munro et al. Implicit Data Structures for Fast Search and Update, 1980, J. Comput. Syst. Sci.

[30] John T. Stasko et al. Pairing heaps: experiments and analysis, 1987, CACM.

[31] Hanan Samet et al. Data structures for quadtree approximation and compression, 1985, CACM.

[32] J. Ian Munro et al. An Implicit Data Structure Supporting Insertion, Deletion, and Search in O(log² n) Time, 1986, J. Comput. Syst. Sci.

[33] Peter Elias et al. Efficient Storage and Retrieval by Content and Address of Static Files, 1974, JACM.

[34] Robert E. Tarjan et al. Storing a sparse table, 1979, CACM.

[35] Hanan Samet et al. The Quadtree and Related Hierarchical Data Structures, 1984, CSUR.

[36] Irene Gargantini et al. An effective way to represent quadtrees, 1982, CACM.

[37] K. Knowlton et al. Progressive transmission of grey-scale and binary pictures by simple, efficient, and lossless encoding schemes, 1980, Proceedings of the IEEE.

[38] Alfred V. Aho et al. Compilers: Principles, Techniques, and Tools, 1986, Addison-Wesley series in computer science / World student series edition.

[39] Gaston H. Gonnet et al. Mind Your Grammar: a New Approach to Modelling Text, 1987, VLDB.

[40] J. Peter Kincaid et al. Variable-depth trie index optimization: theory and experimental results, 1989, TODS.

[41] Derick Wood et al. Implicit Selection, 1988, SWAT.

[42] Shmuel Zaks et al. Lexicographic Generation of Ordered Trees, 1980, Theor. Comput. Sci.

[43] R. G. Stone. On the Choice of Grammar and Parser for the Compact Analytical Encoding of Programs, 1986, Comput. J.

[44] Martti Penttonen et al. Syntax-directed compression of program files, 1986, Softw. Pract. Exp.

[45] Jon Louis Bentley et al. Decomposable Searching Problems I: Static-to-Dynamic Transformation, 1980, J. Algorithms.

[46] Erkki Mäkinen et al. Left distance binary tree representations, 1987, BIT.

[47] Donald E. Knuth et al. The Art of Computer Programming, Volume I: Fundamental Algorithms, 2nd Edition, 1997.

[48] Rudolf Bayer et al. Prefix B-trees, 1977, TODS.

[49] Michael S. Landy et al. Hierarchical Coding of Binary Images, 1985, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[50] Ronald C. Read et al. The Coding of Various Kinds of Unlabeled Trees, 1972.

[51] Taylor L. Booth et al. Encoding of Probabilistic Context-Free Languages, 1971.

[52] Jens M. Dill. Optimal Trie Compaction is NP-Complete, 1987.

[53] Donald Ervin Knuth et al. The Art of Computer Programming, 1968.

[54] Erkki Mäkinen et al. On context-free derivations, 1985.

[55] Douglas W. Jones et al. Application of splay trees to data compression, 1988, CACM.