Union-find with deletions

In the classical union-find problem we maintain a partition of a universe of <i>n</i> elements into disjoint sets subject to the operations union and find. The operation <i>union</i>(<i>A, B, C</i>) replaces sets <i>A</i> and <i>B</i> in the partition by their union, given the name <i>C</i>. The operation <i>find</i>(<i>x</i>) returns the name of the set containing the element <i>x</i>. In this paper we revisit the union-find problem in a context where the underlying partitioned universe is not fixed. Specifically, we allow a <i>delete</i>(<i>x</i>) operation, which removes the element <i>x</i> from the set containing it. We consider both worst-case performance and amortized performance. In both settings the challenge is to dynamically keep the size of the structure representing each set proportional to the number of elements in the set, which may now decrease as a result of deletions.

For any fixed <i>k</i>, we describe a data structure that supports find and delete in <i>O</i>(log<inf><i>k</i></inf> <i>n</i>) worst-case time and union in <i>O</i>(<i>k</i>) worst-case time. This matches the best possible worst-case bounds for find and union in the classical setting. Furthermore, using an incremental global rebuilding technique, we obtain a reduction that converts any union-find data structure into a union-find with deletions data structure. Under this reduction the time bounds for find and union change only by a constant factor.
The time it takes to delete an element <i>x</i> is the same as the time it takes to find the set containing <i>x</i> plus the time it takes to unite a singleton set with this set.

In an amortized setting, a classical data structure of Tarjan supports a sequence of <i>m</i> finds and at most <i>n</i> unions on a universe of <i>n</i> elements in <i>O</i>(<i>n + m</i>α(<i>m + n, n,</i> log <i>n</i>)) time, where α(<i>m, n, l</i>) = min{<i>k</i> | <i>A<inf>k</inf></i>(⌊<i>m/n</i>⌋) > <i>l</i>} and <i>A<inf>i</inf></i>(<i>j</i>) is Ackermann's function as described in [6]. We refine the analysis of this data structure and show that in fact the cost of each find is proportional to the size of the corresponding set. Specifically, we show that one can pay for a sequence of union and find operations by charging a constant to each participating element and <i>O</i>(α(<i>m, n,</i> log(<i>l</i>))) for a find of an element in a set of size <i>l</i>. We also show how to keep these amortized costs for each find and each participating element while allowing deletions. The amortized cost of deleting an element from a set of <i>l</i> elements is the same as the amortized cost of finding the element; namely, <i>O</i>(α(<i>m, n,</i> log(<i>l</i>))).
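
To make the central challenge concrete, the following is a minimal Python sketch of one simple way to keep each set's structure proportional to its number of live elements: deletions are lazy (the node is left vacant in the tree), and a set is rebuilt from scratch once fewer than half of its nodes hold an element. This is not the construction described above. The rebuilding here is done all at once per set rather than incrementally, so the bounds are at best amortized, and union concatenates node lists, which is not constant time. The class name, the `make_set` operation, the one-half threshold, and the choice to drop a set once it becomes empty are all assumptions made for illustration.

```python
class UnionFindWithDeletions:
    """Union-find with lazy deletions and per-set rebuilding (illustrative sketch)."""

    def __init__(self):
        self.parent = {}    # node id -> parent node id (a root is its own parent)
        self.node_of = {}   # live element -> node id holding it
        self.elem_at = {}   # node id -> element, or None if the node is vacant
        self.name = {}      # root node id -> set name
        self.root_of = {}   # set name -> root node id
        self.nodes = {}     # root node id -> all node ids in that tree
        self.live = {}      # root node id -> number of live elements
        self._next = 0

    def _new_node(self, x):
        v = self._next
        self._next += 1
        self.parent[v] = v
        self.elem_at[v] = x
        self.node_of[x] = v
        return v

    def _install_root(self, root, name, nodes, live):
        self.name[root] = name
        self.root_of[name] = root
        self.nodes[root] = nodes
        self.live[root] = live

    def make_set(self, x, name):
        v = self._new_node(x)
        self._install_root(v, name, [v], 1)

    def _root(self, v):
        root = v
        while self.parent[root] != root:
            root = self.parent[root]
        while self.parent[v] != root:          # path compression
            self.parent[v], v = root, self.parent[v]
        return root

    def find(self, x):
        return self.name[self._root(self.node_of[x])]

    def union(self, a, b, c):
        # Assumes a and b name two distinct existing sets.
        ra, rb = self.root_of.pop(a), self.root_of.pop(b)
        if len(self.nodes[ra]) < len(self.nodes[rb]):
            ra, rb = rb, ra                    # link the smaller tree below the larger
        self.parent[rb] = ra
        merged = self.nodes.pop(ra)
        merged.extend(self.nodes.pop(rb))
        live = self.live.pop(ra) + self.live.pop(rb)
        del self.name[ra], self.name[rb]
        self._install_root(ra, c, merged, live)

    def delete(self, x):
        v = self.node_of.pop(x)
        r = self._root(v)
        self.elem_at[v] = None                 # lazy deletion: the node stays in the tree
        self.live[r] -= 1
        if 2 * self.live[r] < len(self.nodes[r]):
            self._rebuild(r)                   # too many vacant nodes: rebuild this set

    def _rebuild(self, r):
        name = self.name.pop(r)
        old_nodes = self.nodes.pop(r)
        del self.live[r], self.root_of[name]
        survivors = [self.elem_at[v] for v in old_nodes if self.elem_at[v] is not None]
        for v in old_nodes:
            del self.parent[v], self.elem_at[v]
        if not survivors:
            return                             # assumption: an emptied set is dropped
        new_nodes = [self._new_node(x) for x in survivors]
        root = new_nodes[0]
        for v in new_nodes[1:]:
            self.parent[v] = root              # rebuilt tree has depth one
        self._install_root(root, name, new_nodes, len(survivors))
```

In this sketch a delete costs one root lookup plus, occasionally, a rebuild whose cost can be charged to the deletions that made the nodes vacant. The reduction described in the text goes further by spreading the rebuilding work over subsequent operations, which is what yields worst-case rather than amortized bounds.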