A performance study of transitive closure algorithms

We present a comprehensive performance evaluation of transitive closure (reachability) algorithms for databases. The study is based on careful implementations of the algorithms, measures page I/O, and covers algorithms for full transitive closure as well as partial transitive closure (finding all successors of each node in a given set of source nodes). We examine a wide range of acyclic graphs with varying density and “locality” of arcs. We also consider query parameters, such as the selectivity of the query, and system parameters, such as the buffer size and the page and successor list replacement policies. We show that significant cost tradeoffs exist among the algorithms across this spectrum, and we identify the factors that influence their performance. An important aspect of our work is that we measure a number of different cost metrics, which gives us a good understanding of how well each metric predicts I/O cost. This is especially significant because metrics such as the number of tuples generated or the number of successor list operations have been widely used in the literature to compare transitive closure algorithms. Our results strongly suggest that these other metrics cannot be reliably used to predict the I/O cost of transitive closure evaluation.
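
To make the distinction between full and partial transitive closure concrete, the following is a minimal in-memory sketch, not one of the algorithms evaluated in the study. It assumes the graph is given as a list of (source, target) arcs, and the function names `partial_transitive_closure` and `full_transitive_closure` are hypothetical. It does not model pages, buffers, or successor list replacement, so it says nothing about I/O cost.

```python
from collections import defaultdict


def partial_transitive_closure(arcs, sources):
    """Return {source: set of all successors} for each given source node.

    Illustrative sketch only: a plain depth-first traversal per source,
    with no notion of disk pages or buffer management.
    """
    # Build adjacency lists (the immediate successors of each node).
    adj = defaultdict(list)
    for u, v in arcs:
        adj[u].append(v)

    closure = {}
    for s in sources:
        seen = set()
        stack = list(adj[s])
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            stack.extend(adj[node])
        closure[s] = seen
    return closure


def full_transitive_closure(arcs):
    """Full closure is the partial closure with every node as a source."""
    nodes = {n for arc in arcs for n in arc}
    return partial_transitive_closure(arcs, nodes)


# A small acyclic example graph (hypothetical data).
arcs = [(1, 2), (2, 3), (1, 4), (5, 3)]
partial = partial_transitive_closure(arcs, [1])
full = full_transitive_closure(arcs)
```

Here `partial` holds the successors of node 1 only, while `full` holds a successor set for every node in the graph. A query's selectivity, in the sense used above, corresponds to how many source nodes are passed in.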
