The cell probe complexity of dynamic data structures

Dynamic data structure problems involve the representation of data in memory in such a way as to permit certain types of modifications of the data (updates) and certain types of questions about the data (queries). This paradigm encompasses many fundamental problems in computer science. The purpose of this paper is to prove new lower and upper bounds on the time per operation to implement solutions to some familiar dynamic data structure problems including list representation, subset ranking, partial sums, and the set union problem. The main features of our lower bounds are:<list><item>They hold in the <italic>cell probe</italic> model of computation (A. Yao [18]) in which the time complexity of a sequential computation is defined to be the number of words of memory that are accessed. (The number of bits <italic>b</italic> in a single word of memory is a parameter of the model.) All other computations are free. This model is at least as powerful as a random access machine and allows for unusual representations of data, indirect addressing, etc. This contrasts with most previous lower bounds, which are proved in models (e.g., algebraic, comparison, pointer manipulation) that require restrictions on the way data is represented and manipulated. </item><item>The lower bound method presented here can be used to derive amortized complexities, worst case per operation complexities, and randomized complexities. </item><item>The results occasionally provide (nearly tight) tradeoffs between the number <italic>R</italic> of words of memory that are read per operation, the number <italic>W</italic> of memory words rewritten per operation and the size <italic>b</italic> of each word. For the problems considered here there is a parameter <italic>n</italic> that represents the size of the data set being manipulated and for these problems <italic>b</italic> = log<italic>n</italic> is a natural register size to consider. 
By letting <italic>b</italic> vary, our results illustrate the effect of register size on time complexity. For instance, one consequence of the results is that for some of the problems considered here, increasing the register size from log<italic>n</italic> to polylog(<italic>n</italic>) only reduces the time complexity by a constant factor. On the other hand, decreasing the register size from log<italic>n</italic> to 1 increases time complexity by a log<italic>n</italic> factor for one of the problems we consider and only a loglog<italic>n</italic> factor for some other problems. </item></list> The first two specific data structure problems for which we obtain bounds are:<list><item>List Representation. This problem concerns the representation of an ordered list of at most <italic>n</italic> (not necessarily distinct) elements from the universe <italic>U</italic> = {1, 2,…, <italic>n</italic>}. The operations to be supported are report(<italic>k</italic>), which returns the <italic>k<supscrpt>th</supscrpt></italic> element of the list; insert(<italic>k</italic>, <italic>u</italic>), which inserts element <italic>u</italic> into the list between the elements in positions <italic>k</italic> - 1 and <italic>k</italic>; and delete(<italic>k</italic>), which deletes the <italic>k<supscrpt>th</supscrpt></italic> item. </item><item>Subset Rank. This problem concerns the representation of a subset <italic>S</italic> of <italic>U</italic> = {1, 2,…, <italic>n</italic>}. The operations that must be supported are the updates “insert item <italic>j</italic> into the set” and “delete item <italic>j</italic> from the set” and the queries rank(<italic>j</italic>), which returns the number of elements in <italic>S</italic> that are less than or equal to <italic>j</italic>. </item></list> The natural word size for these problems is <italic>b</italic> = log<italic>n</italic>, which allows an item of <italic>U</italic> or an index into the list to be stored in one register. 
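As a baseline for Subset Rank, the two naive representations trade update cost against query cost. The sketch below (our illustration, not a construction from the paper) stores a membership bit per element, giving constant time updates but a linear-time scan for rank queries:

```python
class BitArrayRank:
    """Naive subset-rank structure: O(1) updates, O(n) rank queries."""

    def __init__(self, n):
        # member[i] is True iff i is currently in the set S.
        self.member = [False] * (n + 1)

    def insert(self, j):
        self.member[j] = True

    def delete(self, j):
        self.member[j] = False

    def rank(self, j):
        # Count elements of S that are <= j: a full scan of the prefix.
        return sum(self.member[1 : j + 1])
```

The symmetric alternative (precomputed prefix counts) makes rank constant time but pushes the linear cost onto updates; the balanced-tree solutions discussed below split the difference at O(log n) per operation.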
One simple solution to the list representation problem is to maintain a vector <italic>v</italic>, whose <italic>k<supscrpt>th</supscrpt></italic> entry contains the <italic>k<supscrpt>th</supscrpt></italic> item of the list. The report operation can be done in constant time, but the insert and delete operations may take time linear in the length of the list. Alternatively, one could store the items of the list with each element having a pointer to its predecessor and successor in the list. This allows for constant time updates (given a pointer to the appropriate location), but requires linear cost for queries. This problem can be solved much more efficiently by use of balanced trees (such as AVL trees). When <italic>b</italic> = log<italic>n</italic>, the worst case cost per operation using AVL trees is <italic>O</italic>(log<italic>n</italic>). If instead <italic>b</italic> = 1, so that each bit access costs 1, then the AVL tree solution requires <italic>O</italic>(log<supscrpt>2</supscrpt><italic>n</italic>) per operation. It is not hard to find similar upper bounds for the subset rank problem (the algorithms for this problem are actually simpler than AVL trees). The question is: are these upper bounds best possible? Our results show that the upper bounds for the case of log<italic>n</italic> bit registers are within a loglog<italic>n</italic> factor of optimal. On the other hand, somewhat surprisingly, for the case of single bit registers there are implementations for both of these problems that run in time significantly faster than <italic>O</italic>(log<supscrpt>2</supscrpt><italic>n</italic>) per operation. Let CPROBE(<italic>b</italic>) denote the cell probe computational model with register size <italic>b</italic>. Theorem 1. 
If <italic>b</italic> ≤ (log<italic>n</italic>)<supscrpt><italic>t</italic></supscrpt> for some <italic>t</italic>, then any CPROBE(<italic>b</italic>) implementation of either list representation or subset rank requires &OHgr;(log<italic>n</italic>/loglog<italic>n</italic>) amortized time per operation. Theorem 2. Subset rank and list representation have CPROBE(1) implementations with respective complexities <italic>O</italic>((log<italic>n</italic>)(loglog<italic>n</italic>)) and <italic>O</italic>((log<italic>n</italic>)(loglog<italic>n</italic>)<supscrpt>2</supscrpt>) per operation. Paul Dietz (personal communication) has found an implementation of list representation with log<italic>n</italic> bit registers that requires only <italic>O</italic>(log<italic>n</italic>/loglog<italic>n</italic>) time per operation, and thus the result of theorem 1 is best possible. The lower bounds of theorem 1 are derived from lower bounds for a third problem:<list><item>Partial sums mod k. An array <italic>A</italic>[1],…, <italic>A</italic>[<italic>N</italic>] of integers mod <italic>k</italic> is to be represented. Updates are add(<italic>i</italic>, δ), which implements <italic>A</italic>[<italic>i</italic>] ← <italic>A</italic>[<italic>i</italic>] + δ; and queries are sum(<italic>j</italic>), which returns &Sgr;<subscrpt><italic>i</italic>≤<italic>j</italic></subscrpt><italic>A</italic>[<italic>i</italic>] (mod <italic>k</italic>). </item></list> This problem is denoted PS(n, k). Our main lower bound theorems provide tradeoffs between the number of register rewrites and register reads as a function of <italic>n</italic>, <italic>k</italic>, and <italic>b</italic>. Two corollaries of these results are:<list><item>Theorem 3. 
Any CPROBE(<italic>b</italic>) implementation of PS(n, 2) (partial sums mod 2) requires &OHgr;(log<italic>n</italic>/(loglog<italic>n</italic> + log<italic>b</italic>)) amortized time per operation, and for <italic>b</italic> ≥ log<italic>n</italic>, there is an implementation that achieves this. In particular, if <italic>b</italic> = &THgr;((log<italic>n</italic>)<supscrpt>c</supscrpt>) for some constant <italic>c</italic>, then the optimal time complexity of PS(n, 2) is &THgr;(log<italic>n</italic>/loglog<italic>n</italic>). </item><item>Theorem 4. Any CPROBE(1) implementation of PS(n, n) requires &OHgr;((log<italic>n</italic>/loglog<italic>n</italic>)<supscrpt>2</supscrpt>) amortized time per operation, and there is an implementation that achieves <italic>O</italic>(log<supscrpt>2</supscrpt><italic>n</italic>) time per operation. </item></list> It can be shown that a lower bound for PS(n, 2) is also a lower bound for both list representation and subset rank (the details, which are not difficult, are omitted from this report), and thus theorem 1 follows from theorem 3. The results of theorem 4 make an interesting contrast with those of theorem 2. For the three problems, list representation, subset rank and PS(n, k), there are standard algorithms that can be implemented on a CPROBE(log<italic>n</italic>) using <italic>O</italic>(log<italic>n</italic>) time per operation, and whose implementations on a CPROBE(1) require <italic>O</italic>(log<supscrpt>2</supscrpt><italic>n</italic>) time. Theorem 4 says that for the problem PS(n, n) this algorithm is essentially best possible, while theorem 2 says that for list representation and rank, the algorithm can be significantly improved. 
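The standard O(log n)-per-operation algorithm on a CPROBE(log n) referred to above can be realized with a binary indexed (Fenwick) tree. The following sketch is our illustration, not the paper's construction; it supports add and sum for PS(n, k), touching O(log n) words per operation:

```python
class PartialSumsModK:
    """PS(n, k): add(i, delta) and sum(j) mod k via a Fenwick tree.

    Each operation reads/writes O(log n) cells, matching the standard
    upper bound discussed in the text.
    """

    def __init__(self, n, k):
        self.n, self.k = n, k
        # tree[i] holds a partial sum over a dyadic block ending at i.
        self.tree = [0] * (n + 1)

    def add(self, i, delta):
        # A[i] <- A[i] + delta (mod k): update O(log n) tree cells.
        while i <= self.n:
            self.tree[i] = (self.tree[i] + delta) % self.k
            i += i & (-i)  # step to the next covering block

    def sum(self, j):
        # Return A[1] + ... + A[j] (mod k), reading O(log n) cells.
        s = 0
        while j > 0:
            s = (s + self.tree[j]) % self.k
            j -= j & (-j)  # drop the lowest set bit
        return s
```

With k = 2 this specializes to the parity prefix-sum problem PS(n, 2) of theorem 3.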
In fact, the rank problem can be viewed as a special case of PS(n, n) where the variables take on values in {0, 1}, and apparently this specialization is enough to reduce the complexity on a CPROBE(1) by a factor of log<italic>n</italic>/loglog<italic>n</italic>, even though on a CPROBE(log<italic>n</italic>) the complexities of the two problems differ by no more than a loglog<italic>n</italic> factor. The third problem we consider is the set union problem. This problem concerns the design of a data structure for the on-line manipulation of sets in the following setting. Initially, there are <italic>n</italic> singleton sets {1}, {2},…, {<italic>n</italic>} with <italic>i</italic> chosen as the name of the set {<italic>i</italic>}. Our data structure is required to implement two operations, Find(<italic>j</italic>) and Union(<italic>A</italic>, <italic>B</italic>).
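For context, the classical pointer-based solution to the set union problem uses union by rank with path compression (cf. Tarjan [1]), achieving nearly constant amortized time per operation; a minimal sketch (our illustration):

```python
class UnionFind:
    """Set union with union by rank and path compression."""

    def __init__(self, n):
        # Each i in {1, ..., n} starts as its own singleton set {i}.
        self.parent = list(range(n + 1))
        self.rank = [0] * (n + 1)

    def find(self, j):
        # Follow parent pointers to the root, then compress the path.
        root = j
        while self.parent[root] != root:
            root = self.parent[root]
        while self.parent[j] != root:
            self.parent[j], j = root, self.parent[j]
        return root

    def union(self, a, b):
        # Merge the sets containing a and b, linking the shorter tree
        # under the taller one to keep find paths logarithmic.
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return
        if self.rank[ra] < self.rank[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra
        if self.rank[ra] == self.rank[rb]:
            self.rank[ra] += 1
```

The amortized cost per operation of this scheme is O(α(n)) (inverse Ackermann), the benchmark against which cell probe lower bounds for set union are measured.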

[1] Robert E. Tarjan, et al. Efficiency of a Good But Not Linear Set Union Algorithm. JACM, 1972.

[2] Miklós Ajtai, et al. A lower bound for finding predecessors in Yao's cell probe model. Combinatorica, 1988.

[3] Dan E. Willard. Log-Logarithmic Worst-Case Range Queries are Possible in Space Theta(N). Inf. Process. Lett., 1983.

[4] Norbert Blum. On the Single-Operation Worst-Case Time Complexity of the Disjoint Set Union Problem. STACS, 1985.

[5] Kurt Mehlhorn, et al. A Lower Bound on the Complexity of the Union-Split-Find Problem. SIAM J. Comput., 1988.

[6] Andrew Chi-Chih Yao. On the Complexity of Maintaining Partial Sums. SIAM J. Comput., 1985.

[7] Jan van Leeuwen, et al. Worst-case Analysis of Set Union Algorithms. JACM, 1984.

[8] János Komlós, et al. Storing a sparse table with O(1) worst case access time. 23rd Annual Symposium on Foundations of Computer Science (FOCS), 1982.

[9] Friedhelm Meyer auf der Heide, et al. Dynamic perfect hashing: upper and lower bounds. 29th Annual Symposium on Foundations of Computer Science (FOCS), 1988.

[10] Allan Borodin, et al. Efficient Searching Using Partial Ordering. Inf. Process. Lett., 1981.

[11] Robert E. Tarjan, et al. A linear-time algorithm for a special case of disjoint set union. J. Comput. Syst. Sci., 1983.

[12] Michael L. Fredman, et al. The Complexity of Maintaining an Array and Computing Its Partial Sums. JACM, 1982.

[13] Robert E. Tarjan, et al. A Class of Algorithms which Require Nonlinear Time to Maintain Disjoint Sets. J. Comput. Syst. Sci., 1979.

[14] Andrew Chi-Chih Yao, et al. Should Tables Be Sorted? JACM, 1981.

[15] Kurt Mehlhorn, et al. A Lower Bound for the Complexity of the Union-Split-Find Problem. ICALP, 1987.

[16] Robert E. Tarjan, et al. Fast Algorithms for Finding Nearest Common Ancestors. SIAM J. Comput., 1984.

[17] Harry G. Mairson. Average case lower bounds on the construction and searching of partial orders. 26th Annual Symposium on Foundations of Computer Science (FOCS), 1985.