Size-Estimation Framework with Applications to Transitive Closure and Reachability

Computing the transitive closure in directed graphs is a fundamental graph problem. We consider the more restricted problem of computing the number of nodes reachable from every node and the size of the transitive closure. The fastest known transitive closure algorithms run in O(min{mn, n^2.38}) time, where n is the number of nodes and m the number of edges in the graph. We present an O(m) time randomized (Monte Carlo) algorithm that estimates, with small relative error, the sizes of all reachability sets and the transitive closure. Another ramification of our estimation scheme is an O(m) time algorithm for estimating sizes of neighborhoods in directed graphs with nonnegative edge lengths. Our size-estimation algorithms are much faster than performing the respective explicit computations.
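
The abstract does not spell out the estimation procedure, so the following is only a minimal sketch of one way a size estimator of this kind can be built. It assumes a min-rank scheme: each node draws an independent exponential rank, every node learns the smallest rank in its reachability set, and the set size is estimated from those minima over k independent rounds. The function name `estimate_reach_sizes`, its parameters, and the choice of k are hypothetical, and a node is counted as reaching itself.

```python
import random


def estimate_reach_sizes(n, edges, k=64, seed=0):
    """Estimate the number of nodes reachable from each node 0..n-1.

    Illustrative sketch (assumed min-rank scheme, not taken from the text):
    the minimum of s independent Exp(1) ranks is Exp(s), so over k rounds
    (k - 1) / (sum of observed minima) is an unbiased estimate of s.
    """
    rng = random.Random(seed)
    # Reverse adjacency: rev[v] lists every u with an edge u -> v.
    rev = [[] for _ in range(n)]
    for u, v in edges:
        rev[v].append(u)

    sums = [0.0] * n
    for _ in range(k):
        ranks = [rng.expovariate(1.0) for _ in range(n)]
        min_rank = [None] * n
        # Visit nodes by increasing rank. A backward search from w labels
        # every still-unlabelled node that can reach w; labelled nodes are
        # never expanded again, so each edge is scanned once per round.
        for w in sorted(range(n), key=ranks.__getitem__):
            if min_rank[w] is not None:
                continue
            min_rank[w] = ranks[w]
            stack = [w]
            while stack:
                x = stack.pop()
                for u in rev[x]:
                    if min_rank[u] is None:
                        min_rank[u] = ranks[w]
                        stack.append(u)
        for v in range(n):
            sums[v] += min_rank[v]

    return [(k - 1) / s for s in sums]


if __name__ == "__main__":
    # Path 0 -> 1 -> 2 -> 3 plus an isolated node 4.
    sizes = estimate_reach_sizes(5, [(0, 1), (1, 2), (2, 3)], k=400, seed=1)
    print([round(s, 2) for s in sizes])
    # Summing the per-node estimates gives an estimate of the closure size.
    print(round(sum(sizes), 2))
```

Under these assumptions each round costs one sort plus a single pass over the edges, and the relative error shrinks roughly like 1/sqrt(k), so k trades running time against accuracy.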
