Efficient and effective community search

Community search is the problem of finding a good community for a given set of query vertices. One of the most studied formulations of community search asks for a connected subgraph that contains all query vertices and maximizes the minimum degree. All existing approaches to min-degree-based community search suffer from limitations in efficiency, as they need to visit a large part of (or the whole) input graph, as well as in accuracy, as they output communities that are quite large and not very cohesive. Moreover, some existing methods lack generality: they handle only single-vertex queries, find communities that are not optimal in terms of minimum degree, and/or require input parameters. In this work we advance the state of the art on community search by proposing a novel method that overcomes all these limitations: it is in general more efficient and effective (by one to two orders of magnitude on average), it can handle multiple query vertices, it yields optimal communities, and it is parameter-free. These properties are confirmed by an extensive experimental analysis performed on various real-world graphs.
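To make the min-degree formulation concrete, the following is a minimal sketch of the classic global greedy peeling baseline: repeatedly delete a minimum-degree vertex and keep the best intermediate subgraph that still connects the query. It is not the method proposed in this work. It illustrates the kind of whole-graph approach the abstract criticizes, and the function names and the adjacency-dictionary input format are assumptions made for illustration only.

```python
from collections import deque


def component_of(adj, alive, start):
    """Return the set of alive vertices reachable from start (BFS)."""
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v in alive and v not in seen:
                seen.add(v)
                queue.append(v)
    return seen


def greedy_min_degree_community(adj, query):
    """Illustrative global peeling baseline (hypothetical helper name).

    adj   : dict mapping each vertex to the set of its neighbours (undirected).
    query : non-empty iterable of query vertices, all present in adj.
    Returns (community, min_degree), or (None, -1) if the query vertices
    are not connected in the input graph.
    """
    query = set(query)
    if not query:
        raise ValueError("query must contain at least one vertex")
    alive = set(adj)
    best, best_score = None, -1
    start = next(iter(query))
    while True:
        # Keep only the connected component that contains the query.
        comp = component_of(adj, alive, start)
        if not query <= comp:
            break  # the query vertices have been disconnected
        alive = comp
        deg = {u: sum(1 for v in adj[u] if v in alive) for u in alive}
        u_min = min(alive, key=deg.get)
        if deg[u_min] > best_score:
            best, best_score = set(alive), deg[u_min]
        if u_min in query:
            break  # a query vertex has minimum degree: no further gain
        alive.remove(u_min)
    return best, best_score


# Toy graph: a 4-clique {0, 1, 2, 3} with a tail 3 - 4 - 5.
graph = {
    0: {1, 2, 3},
    1: {0, 2, 3},
    2: {0, 1, 3},
    3: {0, 1, 2, 4},
    4: {3, 5},
    5: {4},
}
print(greedy_min_degree_community(graph, [0]))
print(greedy_min_degree_community(graph, [0, 5]))
```

On this toy graph the query {0} yields the clique with minimum degree 3, while the query {0, 5} forces the whole graph to be returned with minimum degree 1. This unoptimized version recomputes degrees at every step and always starts from the full graph, which is exactly the efficiency concern raised above.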
