Reachability queries arise frequently in applications that work with graph-structured data, and in some of them testing reachability between two nodes answers a question of direct interest. For example, in protein-protein interaction networks a reachability query can tell whether two proteins are related, whereas in ontological databases it might correspond to asking whether one concept subsumes another. Given the huge databases against which reachability queries are often run, it is important to come up with a scalable indexing scheme that has almost constant query time. In this paper, we bring a new dimension to the well-known interval labeling approach. Our approach labels each node with multiple intervals instead of a single interval, so that each label represents a hyper-rectangle. Our new approach, BOX, can index DAGs in linear time and space while keeping the query time acceptable. In experiments, we show that BOX is not vulnerable to increasing edge-to-node ratios, which are a problem for existing approaches.
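
To make the idea of multi-interval labels concrete, the following is a minimal sketch of a generic multi-dimensional interval labeling, not the BOX construction itself, which the abstract does not spell out. It assumes each dimension comes from one randomized post-order traversal of the DAG, so a node's label is a list of (low, high) intervals, that is, a hyper-rectangle. If u reaches v, the rectangle of v is contained in that of u in every dimension, so a failed containment test proves non-reachability and a successful one is confirmed by a pruned search. The names `label_dag`, `contains`, and `reaches` are hypothetical.

```python
import random

def label_dag(adj, dims=2, seed=0):
    """Label every node of a DAG with `dims` intervals (one hyper-rectangle).

    adj maps a node to the list of its successors. Assumption: the graph
    is acyclic and small enough for recursive traversal.
    """
    rng = random.Random(seed)
    nodes = list(dict.fromkeys(list(adj) + [v for s in adj.values() for v in s]))
    labels = {n: [] for n in nodes}
    for _ in range(dims):
        rank, low = {}, {}
        counter = 0

        def visit(u):
            nonlocal counter
            if u in rank:
                return
            children = list(adj.get(u, ()))
            rng.shuffle(children)  # a different child order per dimension
            for c in children:
                visit(c)
            counter += 1
            rank[u] = counter  # post-order rank: descendants finish first
            low[u] = min([counter] + [low[c] for c in children])

        order = nodes[:]
        rng.shuffle(order)
        for n in order:
            visit(n)
        for n in nodes:
            labels[n].append((low[n], rank[n]))
    return labels

def contains(outer, inner):
    """True if hyper-rectangle `inner` lies inside `outer` in every dimension."""
    return all(o[0] <= i[0] and i[1] <= o[1] for o, i in zip(outer, inner))

def reaches(adj, labels, u, v):
    """Exact reachability test: the labels prune, a search confirms."""
    if u == v:
        return True
    if not contains(labels[u], labels[v]):
        return False  # containment is necessary for reachability
    stack, seen = [u], {u}
    while stack:
        x = stack.pop()
        for c in adj.get(x, ()):
            if c == v:
                return True
            if c not in seen and contains(labels[c], labels[v]):
                seen.add(c)
                stack.append(c)
    return False
```

For instance, on the diamond-shaped DAG `{"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}`, `reaches` reports that `a` reaches `d` and that `b` does not reach `c`. In this sketch, adding dimensions only tightens the filter; the fallback search is what keeps answers exact.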