Pattern-Avoiding Access in Binary Search Trees

The dynamic optimality conjecture is perhaps the most fundamental open question about binary search trees (BSTs). It postulates the existence of an asymptotically optimal online BST, i.e., one that is constant-factor competitive with any BST on any input access sequence. The two main candidates for dynamic optimality in the literature are splay trees [Sleator and Tarjan, 1985] and GREEDY [Lucas, 1988; Munro, 2000; Demaine et al., 2009]. Despite BSTs being among the simplest data structures in computer science, and despite extensive effort over the past three decades, the conjecture remains elusive. Dynamic optimality is trivial for almost all sequences: the optimum access cost of most length-n sequences is Θ(n log n), achievable by any balanced BST. Thus, the obvious missing step towards the conjecture is an understanding of the “easy” access sequences, and indeed the most fruitful research direction so far has been the study of specific sequences whose “easiness” is captured by a parameter of interest. For instance, splay provably achieves a bound of O(nd) when d roughly measures the distances between consecutive accesses (dynamic finger), the average entropy (static optimality), or the delays between multiple accesses of an element (working set). The difficulty of proving dynamic optimality is witnessed by other highly restricted special cases that remain unresolved; one prominent example is the traversal conjecture [Sleator and Tarjan, 1985], which states that preorder sequences (whose optimum is linear) are accessed in linear time by splay trees; no online BST is known to satisfy this conjecture. In this paper, we prove two different relaxations of the traversal conjecture for GREEDY: (i) GREEDY is almost linear for preorder traversal, (ii) if linear-time preprocessing is allowed, GREEDY is in fact linear. These statements are corollaries of our more general results that express the complexity of access sequences in terms of a pattern avoidance parameter k.
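To make the cost model concrete, the following is a minimal sketch of GREEDY in the geometric view of Demaine et al. (2009), not the authors' implementation. It assumes an empty initial point set (no initial tree), and the name `greedy_cost` is hypothetical. At time t it touches the accessed key plus, on each side, every key whose last touch is strictly more recent than that of all keys between it and the accessed key (the "staircase"); the cost is the total number of touched points.

```python
def greedy_cost(accesses):
    """Total number of points touched by geometric GREEDY.

    Sketch only: assumes no initial tree, and uses a plain O(n) scan
    per access instead of an efficient staircase structure.
    """
    keys = sorted(set(accesses))
    pos = {key: i for i, key in enumerate(keys)}
    last = [-1] * len(keys)  # last[i]: time key i was last touched, -1 if never
    total = 0
    for t, key in enumerate(accesses):
        x = pos[key]
        touched = [x]
        for step in (-1, 1):  # scan left of x, then right of x
            newest = last[x]
            y = x + step
            while 0 <= y < len(keys):
                # y is on the staircase if it was touched more recently
                # than every key between it and x (x included).
                if last[y] > newest:
                    touched.append(y)
                    newest = last[y]
                y += step
        for y in touched:  # update only after both scans are done
            last[y] = t
        total += len(touched)
    return total


if __name__ == "__main__":
    print(greedy_cost(list(range(5))))          # sequential access
    print(greedy_cost([3, 1, 0, 2, 5, 4, 6]))   # a preorder sequence
```

Under these assumptions, sequential access of five keys touches one point in the first step and two in each later step, so `greedy_cost(list(range(5)))` returns 9.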
Pattern avoidance is a well-established concept in combinatorics, and the classes of input sequences thus defined are rich, e.g., the k = 3 case includes preorder sequences. For any sequence X with parameter k, our most general result shows that GREEDY achieves the cost n·2^{α(n)^{O(k)}}, where α is the inverse Ackermann function. Furthermore, a broad subclass of parameter-k sequences has a natural combinatorial interpretation as k-decomposable sequences. For this class of inputs, we obtain an n·2^{O(k^2)} bound for GREEDY when preprocessing is allowed. For k = 3, these results imply (i) and (ii). To our knowledge, these are the first upper bounds for GREEDY that are not known to hold for any other online BST. To obtain these results we identify an input-revealing property of GREEDY. Informally, this means that the execution log partially reveals the structure of the access sequence. This property facilitates the use of rich technical tools from forbidden submatrix theory. Further studying the intrinsic complexity of k-decomposable sequences, we make several observations. First, in order to obtain an offline optimal BST, it is enough to bound GREEDY on non-decomposable access sequences. Furthermore, we show that the optimal cost for k-decomposable sequences is Θ(n log k), which is well below the proven performance of all known BST algorithms. Hence, sequences in this class can be seen as a “candidate counterexample” to dynamic optimality.
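As an illustration of the pattern avoidance parameter, here is a brute-force sketch that tests whether an access sequence contains a given permutation pattern, i.e., whether some subsequence is order-isomorphic to it. The function name `contains_pattern` is hypothetical, keys are assumed distinct, and the check takes O(n^k) time, so it is only suitable for small examples. The abstract does not name the specific pattern for k = 3; the example relies on the standard fact that preorder sequences of BSTs avoid the pattern (2, 3, 1).

```python
from itertools import combinations


def contains_pattern(seq, pattern):
    """Return True if some subsequence of seq is order-isomorphic to pattern.

    pattern is a permutation of 1..k given as a tuple; seq is assumed
    to consist of distinct keys. Brute force over all k-subsets of positions.
    """
    k = len(pattern)
    for idx in combinations(range(len(seq)), k):
        vals = [seq[i] for i in idx]
        ranks = sorted(vals)
        # Replace each value by its rank (1-based) among the chosen values.
        if tuple(ranks.index(v) + 1 for v in vals) == tuple(pattern):
            return True
    return False


if __name__ == "__main__":
    preorder = [3, 1, 0, 2, 5, 4, 6]  # preorder traversal of a balanced BST on 0..6
    print(contains_pattern(preorder, (2, 3, 1)))   # preorder avoids 231
    print(contains_pattern([2, 3, 1], (2, 3, 1)))  # the pattern contains itself
```

A sequence has parameter k in the sense above if this check returns False for at least one pattern of length k.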

[1] Gabriel Nivasch, et al. Improved bounds and new techniques for Davenport-Schinzel sequences and their generalizations, 2008, SODA.

[2] Robert E. Tarjan, et al. Sequential access in splay trees takes linear time, 1985, Comb.

[3] George F. Georgakopoulos, et al. Chain-splay trees, or, how to achieve and prove loglogN-competitiveness by splaying, 2008, Inf. Process. Lett.

[4] J. Ian Munro, et al. On the Competitiveness of Linear Search, 2000, ESA.

[5] Donald E. Knuth, et al. The Art of Computer Programming: Volume IV: Fascicle 2: Generating All Tuples and Permutations, 2005.

[6] Martin Lackner, et al. The computational landscape of permutation patterns, 2013, ArXiv.

[7] Zoltán Füredi, et al. The maximum number of unit distances in a convex n-gon, 1990, J. Comb. Theory, Ser. A.

[8] Seth Pettie, et al. Splay trees, Davenport-Schinzel sequences, and the deque conjecture, 2007, SODA '08.

[9] Gábor Tardos, et al. Excluded permutation matrices and the Stanley-Wilf conjecture, 2004, J. Comb. Theory, Ser. A.

[10] Sergey Kitaev, et al. Patterns in Permutations and Words, 2011, Monographs in Theoretical Computer Science. An EATCS Series.

[11] Robert E. Tarjan, et al. Self-adjusting binary search trees, 1985, JACM.

[12] Kurt Mehlhorn, et al. Nearly optimal binary search trees, 1975, Acta Informatica.

[13] Donald Ervin Knuth, et al. The Art of Computer Programming, 1968.

[14] Daniel Dominic Sleator, et al. O(log log n)-competitive dynamic binary search trees, 2006, SODA '06.

[15] R. Brignall. Permutation Patterns: A survey of simple permutations, 2008, 0801.0963.

[16] Richard Cole, et al. On the Dynamic Finger Conjecture for Splay Trees. Part I: Splay Sorting log n-Block Sequences, 1995, SIAM J. Comput.

[17] Daniel M. Kane, et al. The geometry of binary search trees, 2009, SODA.

[18] Donald E. Knuth, et al. The Art of Computer Programming, Volume I: Fundamental Algorithms, 2nd Edition, 1997.

[19] Seth Pettie. Sharp Bounds on Formation-free Sequences, 2015, SODA.

[20] Seth Pettie, et al. Applications of forbidden 0-1 matrices to search tree and path compression-based data structures, 2010, SODA '10.

[21] Erik D. Demaine, et al. Dynamic Optimality - Almost, 2004, FOCS.

[22] Richard Cole, et al. On the Dynamic Finger Conjecture for Splay Trees. Part II: The Proof, 2000, SIAM J. Comput.

[23] R. Chaudhuri, et al. Splaying a search tree in preorder takes linear time, 1993, SIGA.

[24] Erik D. Demaine, et al. New bounds on optimal binary search trees, 2006.

[25] Kurt Mehlhorn, et al. Self-Adjusting Binary Search Trees: What Makes Them Tick?, 2015, ESA.

[26] John Iacono, et al. In Pursuit of the Dynamic Optimality Conjecture, 2013, Space-Efficient Data Structures, Streams, and Algorithms.

[27] Robert E. Tarjan, et al. Sorting Using Networks of Queues and Stacks, 1972, J. ACM.

[28] Kyle Fox, et al. Upper Bounds for Maximally Greedy Binary Search Trees, 2011, WADS.

[29] Erik D. Demaine, et al. Combining Binary Search Trees, 2013, ICALP.

[30] Prosenjit Bose, et al. Pattern Matching for Permutations, 1993, WADS.

[31] Jacob Fox, et al. Stanley-Wilf limits are typically exponential, 2013, ArXiv.

[32] Ervin Györi, et al. An Extremal Problem on Sparse 0-1 Matrices, 1991, SIAM J. Discret. Math.

[33] Vaughan R. Pratt, et al. Computing permutations with double-ended queues, parallel stacks and parallel queues, 1973, STOC.

[34] Vincent Vatter, et al. Permutation classes, 2014, 1409.5159.

[35] Kurt Mehlhorn, et al. Greedy Is an Almost Optimal Deque, 2015, WADS.

[36] Robert E. Wilber. Lower Bounds for Accessing Binary Search Trees with Rotations, 1989, SIAM J. Comput.