Compact Labeling Scheme for Ancestor Queries

We consider the following problem. Given a rooted tree $T$, label the nodes of $T$ as compactly as possible such that, given only the labels of two nodes $u$ and $v$, one can determine in constant time whether $u$ is an ancestor of $v$. The best known labeling scheme is rather straightforward and uses labels of at most $2\log_2 n$ bits each, where $n$ is the number of nodes in the tree. Our main result in this paper is a labeling scheme with maximum label length $\log_2 n + O(\sqrt{\log n})$ bits. Our motivation for studying this problem is to enhance the performance of web search engines. In this application each indexed document is a tree, and the labels of all trees are kept in main memory, so even small improvements in the maximum label length are important.
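The abstract does not spell out the straightforward $2\log_2 n$-bit scheme. A minimal sketch follows, assuming it is the classical interval labeling obtained from a depth-first traversal: each node stores its own preorder number and the largest preorder number in its subtree, both in the range $0, \dots, n-1$. The function names, the `children` adjacency dictionary, and the convention that a node counts as its own ancestor are illustrative choices, not taken from the paper.

```python
def interval_labels(children, root):
    """Return {node: (pre, last)} from one iterative depth-first traversal.

    pre  = preorder number of the node
    last = largest preorder number inside the node's subtree
    """
    labels = {}
    pre = {}
    counter = 0
    stack = [(root, False)]
    while stack:
        node, finished = stack.pop()
        if finished:
            # Every descendant has been numbered by now.
            labels[node] = (pre[node], counter - 1)
            continue
        pre[node] = counter
        counter += 1
        stack.append((node, True))
        # Reversed so children are visited in their listed order.
        for child in reversed(children.get(node, [])):
            stack.append((child, False))
    return labels


def is_ancestor(label_u, label_v):
    """Constant-time test on labels only (a node is its own ancestor here)."""
    return label_u[0] <= label_v[0] <= label_u[1]


# Hypothetical example tree: a -> (b -> d), c
tree = {"a": ["b", "c"], "b": ["d"]}
labels = interval_labels(tree, "a")
print(is_ancestor(labels["a"], labels["d"]))
print(is_ancestor(labels["b"], labels["c"]))
```

Each label is a pair of integers below $n$, so it fits in about $2\log_2 n$ bits. The paper's result shrinks the second component's cost to $O(\sqrt{\log n})$ bits, which this sketch does not attempt.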
