Selecting the Optimal Groups: Efficiently Computing Skyline k-Cliques

In many applications, graph nodes carry multi-dimensional numerical attributes, and it is desirable to retrieve a group of nodes that is both highly connected (e.g., a clique) and optimal under some ranking function. It is well known that the skyline returns the candidates for the optimal objects when the ranking function is not specified. Motivated by this, in this paper we formulate the novel model of skyline k-cliques over multi-valued attributed graphs and develop efficient algorithms to compute them. To verify group-based dominance between two k-cliques, we make use of maximum bipartite matching and develop a set of optimization techniques to improve verification efficiency. We then develop a progressive algorithm that enumerates the k-cliques in an order such that a k-clique is guaranteed not to be dominated by any k-clique generated after it. Novel pruning and early termination techniques exclude unpromising nodes and cliques by exploiting the structural and attribute properties of the multi-valued attributed graph. Empirical studies on four real datasets demonstrate the effectiveness of the skyline k-clique model and the efficiency of the proposed techniques.
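
To illustrate how maximum bipartite matching can verify group-based dominance, here is a minimal Python sketch. It rests on assumptions the abstract does not spell out: larger attribute values are taken to be better, and a k-clique A is taken to dominate a k-clique B when the nodes of A can be paired one-to-one with the nodes of B so that every node of A is at least as good as its partner in every dimension, with at least one pair differing. The names `weakly_dominates` and `group_dominates` are hypothetical, and the matching uses a simple augmenting-path search rather than the optimized verification described in the paper.

```python
from typing import List, Sequence

Vector = Sequence[float]


def weakly_dominates(u: Vector, v: Vector) -> bool:
    """True if u is at least as good as v in every dimension (assumes larger is better)."""
    return all(x >= y for x, y in zip(u, v))


def group_dominates(a: List[Vector], b: List[Vector]) -> bool:
    """Check the assumed group-based dominance of a over b via bipartite matching."""
    k = len(a)
    if k != len(b):
        raise ValueError("both groups must contain the same number of nodes")

    # Edge (i, j) exists when node i of a weakly dominates node j of b.
    adj = [[j for j in range(k) if weakly_dominates(a[i], b[j])] for i in range(k)]
    match_of_b = [-1] * k  # match_of_b[j] = index in a matched to b[j], or -1

    def augment(i: int, seen: List[bool]) -> bool:
        # Standard augmenting-path step: try to place a[i], re-routing earlier matches.
        for j in adj[i]:
            if not seen[j]:
                seen[j] = True
                if match_of_b[j] == -1 or augment(match_of_b[j], seen):
                    match_of_b[j] = i
                    return True
        return False

    # A perfect matching is required: every node of a must get a distinct partner in b.
    for i in range(k):
        if not augment(i, [False] * k):
            return False

    # If every matched pair were equal, the two groups would hold identical attribute
    # multisets and no other pairing could be strict, so checking this one matching suffices.
    return any(tuple(a[match_of_b[j]]) != tuple(b[j]) for j in range(k))
```

For example, under these assumptions `group_dominates([(3, 3), (2, 5)], [(1, 2), (2, 4)])` holds because (3, 3) can be paired with (1, 2) and (2, 5) with (2, 4), while two groups with identical attribute vectors do not dominate each other.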
