An Optimal Ancestry Labeling Scheme with Applications to XML Trees and Universal Posets

In this article, we solve the ancestry-labeling scheme problem, which aims at assigning the shortest possible labels (bit strings) to nodes of rooted trees, so that ancestry queries between any two nodes can be answered by inspecting their assigned labels only. This problem was introduced more than 20 years ago by Kannan et al. [1988] and is among the most well-studied problems in the field of informative labeling schemes. We construct an ancestry-labeling scheme for n-node trees with label size $\log_2 n + O(\log\log n)$ bits, thus matching the $\log_2 n + \Omega(\log\log n)$ bits lower bound given by Alstrup et al. [2003]. Our scheme is based on a simplified ancestry scheme that operates extremely well on a restricted set of trees. In particular, for the set of n-node trees with a depth of at most d, the simplified ancestry scheme enjoys a label size of $\log_2 n + 2\log_2 d + O(1)$ bits. Since the depth of most XML trees is at most some small constant, such an ancestry scheme may be of practical use. In addition, we also obtain an adjacency-labeling scheme that labels n-node trees of depth d with labels of size $\log_2 n + 3\log_2 d + O(1)$ bits. All our schemes assign the labels in linear time and guarantee that any query can be answered in constant time. Finally, our ancestry scheme finds applications to the construction of small universal partially ordered sets (posets). Specifically, for any fixed integer k, it enables the construction of a universal poset of size $\tilde{O}(n^k)$ for the family of n-element posets with a tree dimension of at most k. Up to lower-order terms, this bound is tight thanks to a lower bound of $n^{k - o(1)}$ due to Alon and Scheinerman [1988].
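
To make the problem statement concrete, the following is a minimal sketch of the classical DFS interval scheme, which is not the scheme constructed in this article: it spends roughly $2\log_2 n$ bits per label, whereas the result above achieves $\log_2 n + O(\log\log n)$. The function names `interval_labels` and `is_ancestor`, and the child-list tree representation, are assumptions made for illustration only.

```python
from typing import Dict, List, Tuple

# A label is a pair (pre, last): the node's DFS preorder number and the
# largest preorder number found in its subtree. Both values lie in [0, n),
# so a label takes about 2 * ceil(log2(n)) bits.
Label = Tuple[int, int]


def interval_labels(children: Dict[int, List[int]], root: int) -> Dict[int, Label]:
    """Label every node of a rooted tree in linear time (iterative DFS).

    `children` maps a node to its list of children (hypothetical input format).
    """
    labels: Dict[int, Label] = {}
    pre: Dict[int, int] = {}
    counter = 0
    stack = [(root, False)]
    while stack:
        node, finished = stack.pop()
        if finished:
            # All descendants have been numbered; close the interval.
            labels[node] = (pre[node], counter - 1)
            continue
        pre[node] = counter
        counter += 1
        stack.append((node, True))
        for child in reversed(children.get(node, [])):
            stack.append((child, False))
    return labels


def is_ancestor(a: Label, b: Label) -> bool:
    """Decide ancestry from the two labels alone (a node counts as its own ancestor)."""
    return a[0] <= b[0] <= a[1]


# Example tree: 0 has children 1 and 4; 1 has children 2 and 3.
tree = {0: [1, 4], 1: [2, 3]}
labels = interval_labels(tree, root=0)
print(is_ancestor(labels[1], labels[3]))  # node 1 is an ancestor of node 3
print(is_ancestor(labels[1], labels[4]))  # node 1 is not an ancestor of node 4
```

The query inspects only the two labels and runs in constant time, which is the property an ancestry-labeling scheme must provide; the difficulty addressed by the article is achieving it with labels close to $\log_2 n$ bits rather than $2\log_2 n$.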

[1]  Mikkel Thorup,et al.  Adjacency Labeling Schemes and Induced-Universal Graphs , 2014, STOC.

[2]  Pierre Fraigniaud,et al.  Routing in Trees , 2001, ICALP.

[3]  Haim Kaplan,et al.  Compact labeling schemes for ancestor queries , 2001, SODA '01.

[4]  Mathias Bæk Tejs Knudsen,et al.  A Simple and Optimal Ancestry Labeling Scheme for Trees , 2015, ICALP.

[5]  Ludovic Denoyer,et al.  The Wikipedia XML Corpus , 2006, INEX.

[6]  Hsueh-I Lu,et al.  An Optimal Labeling for Node Connectivity , 2009, ISAAC.

[7]  David Peleg,et al.  Constructing Labeling Schemes Through Universal Matrices , 2006, ISAAC.

[8]  Nicolás Marín,et al.  Review of Data on the Web: from relational to semistructured data and XML by Serge Abiteboul, Peter Buneman, and Dan Suciu. Morgan Kaufmann 1999. , 2003, SGMD.

[9]  Haim Kaplan,et al.  A comparison of labeling schemes for ancestor queries , 2002, SODA '02.

[10]  Stephen Alstrup,et al.  Nearest Common Ancestors: A Survey and a New Algorithm for a Distributed Environment , 2004, Theory of Computing Systems.

[11]  David Adjiashvili,et al.  Labeling Schemes for Bounded Degree Graphs , 2014, ICALP.

[12]  Philip Bille,et al.  Labeling schemes for small distances in trees , 2003, SODA '03.

[13]  Denilson Barbosa,et al.  Studying the XML Web: Gathering Statistics from an XML Sample , 2006, World Wide Web.

[14]  Alin Deutsch,et al.  A Query Language for XML , 1999, Comput. Networks.

[15]  Stephen Alstrup,et al.  Compact Labeling Scheme for Ancestor Queries , 2006, SIAM J. Comput..

[16]  Gerhard Behrendt.  On trees and tree dimension of ordered sets , 1993 .

[17]  Mathias Bæk Tejs Knudsen,et al.  Improved ancestry labeling scheme for trees , 2014, ArXiv.

[18]  E. S. Wolk The comparability graph of a tree , 1962 .

[19]  Denilson Barbosa,et al.  Studying the XML Web: Gathering Statistics from an XML Sample , 2005, World Wide Web.

[20]  Stephen Alstrup,et al.  Nearest common ancestors: a survey and a new distributed algorithm , 2002, SPAA.

[21]  Cyril Gavoille,et al.  Split Decomposition and Distance Labelling: An Optimal Scheme For Distance Hereditary Graphs , 2001, Electron. Notes Discret. Math..

[22]  Jaroslav Nesetril,et al.  Universal partial order represented by means of oriented trees and other simple graphs , 2005, Eur. J. Comb..

[23]  Shiri Chechik,et al.  Compact Routing Schemes , 2016, Encyclopedia of Algorithms.

[24]  Edith Cohen,et al.  Labeling dynamic XML trees , 2002, PODS '02.

[25]  Pierre Fraigniaud,et al.  An optimal ancestry scheme and small universal posets , 2010, STOC '10.

[26]  Pierre Fraigniaud,et al.  Compact ancestry labeling schemes for XML trees , 2010, SODA '10.

[27]  Mathias Bæk Tejs Knudsen,et al.  Optimal Induced Universal Graphs and Adjacency Labeling for Trees , 2015, 2015 IEEE 56th Annual Symposium on Foundations of Computer Science.

[28]  Robert E. Tarjan,et al.  A data structure for dynamic trees , 1981, STOC '81.

[30]  Moni Naor,et al.  Implicit representation of graphs , 1992, STOC '88.

[31]  Stephen Alstrup,et al.  Small induced-universal graphs and compact implicit graph representations , 2002, The 43rd Annual IEEE Symposium on Foundations of Computer Science, 2002. Proceedings..

[32]  Stephen Alstrup,et al.  Improved labeling scheme for ancestor queries , 2002, SODA '02.

[33]  Ran Raz,et al.  Distance labeling in graphs , 2001, SODA '01.

[34]  Shay Kutten,et al.  Distributed verification of minimum spanning trees , 2006, PODC '06.

[35]  Ludovic Denoyer,et al.  The XML Wikipedia Corpus , 2006 .

[36]  Cyril Gavoille,et al.  Shorter Implicit Representation for Planar Graphs and Bounded Treewidth Graphs , 2007, ESA.

[37]  Irena Holubová,et al.  Statistical Analysis of Real XML Data Collections , 2006, COMAD.

[38]  B. Jónsson Universal relational systems , 1956 .

[39]  Dan Suciu,et al.  Journal of the ACM , 2006 .

[40]  Mikkel Thorup.  Compact oracles for reachability and approximate distances in planar digraphs , 2004, JACM.

[41]  Noga Alon,et al.  Degrees of freedom versus dimension for containment orders , 1988 .

[42]  Amos Korman,et al.  Labeling schemes for vertex connectivity , 2007, TALG.

[43]  Haim Kaplan,et al.  Short and Simple Labels for Small Distances and Other Functions , 2001, WADS.

[44]  Pierre Fraigniaud,et al.  On randomized representations of graphs using short labels , 2009, SPAA '09.

[45]  W. Trotter,et al.  Combinatorics and Partially Ordered Sets: Dimension Theory , 1992 .

[46]  Stephen Alstrup,et al.  Near-optimal labeling schemes for nearest common ancestors , 2013, SODA.

[47]  Dan Suciu,et al.  Data on the Web: From Relations to Semistructured Data and XML , 1999 .

[48]  David Peleg,et al.  Labeling schemes for flow and connectivity , 2002, SODA '02.

[49]  John B. Johnston,et al.  Universal infinite partially ordered sets , 1956 .

[50]  David Peleg,et al.  Informative labeling schemes for graphs , 2000, Theor. Comput. Sci..

[51]  William T. Trotter,et al.  The dimension of planar posets , 1977, J. Comb. Theory, Ser. B.