Efficient and Progressive Group Steiner Tree Search

The Group Steiner Tree (GST) problem is a fundamental problem in the database area that has been successfully applied to keyword search in relational databases and team search in social networks. The state-of-the-art algorithm for the GST problem is a parameterized dynamic programming (DP) algorithm, which finds the optimal tree in O(3^k n + 2^k (n log n + m)) time, where k is the number of given groups, and m and n are the numbers of edges and nodes of the graph, respectively. The parameterized DP algorithm has two major limitations: (i) because of its exponential complexity, it is intractable on large graphs even for very small values of k (e.g., k=8), and (ii) it cannot produce any solution until it has completed its entire execution. To overcome these limitations, we propose an efficient and progressive GST algorithm, called PrunedDP. It is based on newly developed optimal-tree decomposition and conditional tree merging techniques. The proposed algorithm not only drastically reduces the search space of the parameterized DP algorithm, but also produces progressively refined feasible solutions during execution. To further speed up PrunedDP, we propose a progressive A*-search algorithm based on several carefully designed lower-bounding techniques. We conduct extensive experiments to evaluate our algorithms on several large-scale real-world graphs. The results show that our best algorithm not only generates progressively refined feasible solutions, but also finds the optimal solution at least two orders of magnitude faster than the state-of-the-art algorithm, while using much less memory.
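To make the baseline concrete, the following is a minimal sketch of a parameterized DP for GST in the usual best-first style, not the PrunedDP or A*-search algorithms proposed here. It assumes an undirected graph with non-negative edge weights; the function name `group_steiner_tree_cost`, the state layout (root node, bitmask of covered groups), and the input format are illustrative assumptions rather than details taken from the paper. Each state stores the cheapest known tree rooted at a node that covers a subset of the groups, and states are expanded either by growing along an edge or by merging two trees that share a root and cover disjoint group sets.

```python
import heapq
from collections import defaultdict


def group_steiner_tree_cost(edges, groups):
    """Return the cost of a minimum-cost tree touching every group, or None.

    edges  -- iterable of (u, v, weight) with non-negative weights (undirected)
    groups -- list of sets of nodes; the tree must contain one node per group
    """
    adj = defaultdict(list)
    for u, v, w in edges:
        adj[u].append((v, w))
        adj[v].append((u, w))

    full = (1 << len(groups)) - 1      # bitmask with every group covered
    best = {}                          # (node, mask) -> cheapest cost seen
    heap = []                          # priority queue of (cost, node, mask)

    def relax(v, mask, cost):
        if cost < best.get((v, mask), float("inf")):
            best[(v, mask)] = cost
            heapq.heappush(heap, (cost, v, mask))

    # A single node of group i is a zero-cost tree covering {i}.
    for i, members in enumerate(groups):
        for v in members:
            relax(v, 1 << i, 0)

    settled = defaultdict(dict)        # node -> {mask: optimal cost}
    while heap:
        cost, v, mask = heapq.heappop(heap)
        if mask in settled[v]:
            continue                   # stale queue entry
        settled[v][mask] = cost
        if mask == full:
            return cost                # first full state popped is optimal

        # Edge growing: re-root the tree at a neighbour u.
        for u, w in adj[v]:
            if mask not in settled[u]:
                relax(u, mask, cost + w)

        # Tree merging: combine with settled trees at v on disjoint groups.
        for other, other_cost in list(settled[v].items()):
            if other & mask == 0:
                relax(v, mask | other, cost + other_cost)

    return None                        # some group cannot be connected


if __name__ == "__main__":
    star = [(0, 1, 1), (1, 2, 1), (1, 3, 1), (0, 4, 5)]
    print(group_steiner_tree_cost(star, [{0}, {2}, {3}]))
```

The sketch returns nothing until a state covering all groups is popped, which illustrates limitation (ii) above, and its state space grows with the number of (node, group subset) pairs, which is the source of the exponential dependence on k.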
