Accelerating community-search problem through faster graph dedensification

Community search is the problem of finding a densely connected subgraph of a large graph that contains a given set of query nodes. It is useful, for example, when high-profile researchers want to organize a conference, or celebrities want to host a party, and are looking for other researchers or similar personalities, respectively. Technically, the problem involves maximizing the minimum degree to find a highly connected subgraph. The well-known Greedy algorithm for this purpose iteratively deletes the node with minimum degree to satisfy a given objective function. We observe that Greedy operates inefficiently because of densely connected regions in a graph, referred to as Hot Spots. In this paper, we introduce the concept of performing community search on a dedensified graph. Our aim is to sparsify the hot spots in order to accelerate the global search performed by Greedy and make it applicable to large graphs. A recently proposed graph dedensification approach adds Compressor Nodes to a graph. However, this method is inefficient because it has to traverse the entire graph to compress the hot spots. To solve this problem, we propose a faster graph dedensification algorithm based on Locality Sensitive Hashing (LSH). We improve the time complexity of the existing graph dedensification method from O(|E|) to O(|N| + |eHDN| + k), where N, E, and eHDN denote the nodes, the edges, and the edges of High Degree Nodes (HDN), respectively, k is the number of hash functions, and |N| ≫ |HDN|. Once the graph is dedensified, we use it to accelerate the community-search operation. We perform experiments on two real-world graphs and observe a significant improvement in the execution time of the Greedy algorithm, while adding fewer compressor nodes and performing fewer traversals for graph dedensification.
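
To make the role of Greedy concrete, the following is a minimal Python sketch of minimum-degree peeling for a set of query nodes. It is an illustration under stated assumptions, not the implementation evaluated in this paper: the function names `greedy_community_search` and `component_of` are hypothetical, the graph is assumed to be undirected and given as an edge list, every query node is assumed to appear in at least one edge, and the stopping rule (stop once a query node has minimum degree or the query nodes become disconnected) follows the usual textbook description of the algorithm. The sketch rescans all nodes in every round, so it favours clarity over the running time discussed above.

```python
from collections import deque


def component_of(adj, start):
    """Return the set of nodes reachable from `start` (breadth-first search)."""
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return seen


def greedy_community_search(edges, query):
    """Peel minimum-degree nodes; return (best node set, its minimum degree).

    Assumption: `edges` is an undirected edge list and every query node
    occurs in it. Names and stopping rule are illustrative only.
    """
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    query = set(query)
    best_nodes, best_min_deg = None, -1

    while True:
        # Keep only the connected component that holds the query nodes.
        comp = component_of(adj, next(iter(query)))
        if not query <= comp:
            break  # query nodes are no longer connected
        for u in list(adj):
            if u not in comp:
                del adj[u]

        # Record the current subgraph if its minimum degree is the best so far.
        victim = min(adj, key=lambda u: len(adj[u]))
        min_deg = len(adj[victim])
        if min_deg > best_min_deg:
            best_min_deg, best_nodes = min_deg, set(adj)

        if victim in query:
            break  # a query node now has minimum degree; peeling must stop

        # Delete the minimum-degree node and update its neighbours.
        for v in adj.pop(victim):
            adj[v].discard(victim)

    return best_nodes, best_min_deg


# A 4-clique {1, 2, 3, 4} with a sparse tail 4-5-6, queried at node 1.
edges = [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4), (4, 5), (5, 6)]
community, min_degree = greedy_community_search(edges, [1])
```

In this toy graph the tail nodes 5 and 6 are peeled first, which leaves the clique as the subgraph with the largest minimum degree. In a dense hot spot many nodes share similar degrees, so many rounds of peeling are needed before the subgraph changes meaningfully, which is the inefficiency that dedensification targets.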
