Online local communities with motifs

A community in a network is a set of nodes that are densely connected to one another yet sparsely connected to nodes outside the set. Detecting communities in large networks helps solve many real-world problems. However, detecting communities in a complex network by processing the whole network is costly. Instead, one can find overlapping communities by starting from one or more seed nodes of interest. Moreover, in the online setting the network arrives as a stream of higher-order structures, i.e., triangles of nodes, which must be clustered into communities. In this paper, we propose an online local graph community detection algorithm that uses motifs, such as triangles of nodes. We provide experimental results and compare the algorithm to another algorithm, COEUS. We use two public datasets, one of Amazon data and the other of DBLP data. Furthermore, we create and experiment on a new dataset of web pages and their links, built using the Internet Archive. This latter dataset provides insight into how working with motifs differs from working with edges.

[1] Jure Leskovec, et al. Overlapping community detection at scale: a nonnegative matrix factorization approach, 2013, WSDM.

[2] Jure Leskovec, et al. SNAP Datasets: Stanford Large Network Dataset Collection, 2014.

[3] Sune Lehmann, et al. Link communities reveal multiscale complexity in networks, 2009, Nature.

[4] Fan Chung Graham, et al. Using PageRank to Locally Partition a Graph, 2007, Internet Math.

[5] Andrew McGregor, et al. Graph stream algorithms: a survey, 2014, SGMD.

[6] Alex Delis, et al. COEUS: Community detection via seed-set expansion on graph streams, 2017, 2017 IEEE International Conference on Big Data (Big Data).

[7] Fan Chung Graham, et al. Local Graph Partitioning using PageRank Vectors, 2006, 2006 47th Annual IEEE Symposium on Foundations of Computer Science (FOCS'06).

[8] Kevin J. Lang, et al. Communities from seed sets, 2006, WWW '06.

[9] Jennifer Widom, et al. Scaling personalized web search, 2003, WWW '03.

[10] M E J Newman, et al. Finding and evaluating community structure in networks, 2003, Physical review. E, Statistical, nonlinear, and soft matter physics.

[11] Jean-Loup Guillaume, et al. Fast unfolding of communities in large networks, 2008, arXiv:0803.0476.

[12] Fan Chung Graham, et al. Local Partitioning for Directed Graphs Using PageRank, 2007, Internet Math.

[13] Graham Cormode, et al. An improved data stream summary: the count-min sketch and its applications, 2004, J. Algorithms.

[14] Inderjit S. Dhillon, et al. Overlapping community detection using seed set expansion, 2013, CIKM.

[15] R. Lambiotte, et al. Line graphs, link partitions, and overlapping communities, 2009, Physical review. E, Statistical, nonlinear, and soft matter physics.