Truss Decomposition in Massive Networks

The k-truss is a type of cohesive subgraph recently proposed for the study of networks. While computing most cohesive subgraphs is NP-hard, the k-truss can be computed in polynomial time. Compared with the k-core, which is also efficient to compute, the k-truss represents the "core" of a k-core: it keeps the key information of the k-core while filtering out less important information. However, existing algorithms for computing the k-truss are inefficient on today's massive networks. We first improve the existing in-memory algorithm for computing the k-truss in networks of moderate size. We then propose two I/O-efficient algorithms to handle massive networks that cannot fit in main memory. Our experiments on real datasets verify the efficiency of our algorithms and the value of the k-truss.
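
To make the polynomial-time claim concrete, here is a minimal in-memory sketch in Python. It assumes the commonly used definition, which the abstract itself does not state: the k-truss is the largest subgraph in which every edge lies in at least k - 2 triangles of that subgraph. The function name `k_truss` and the edge-list input are illustrative choices, and this naive peeling loop is neither the improved in-memory algorithm nor one of the I/O-efficient algorithms proposed in the paper.

```python
from collections import defaultdict


def k_truss(edges, k):
    """Return the edge set of the k-truss of an undirected simple graph.

    Assumed definition: every remaining edge must lie in at least
    k - 2 triangles of the remaining subgraph. Nodes must be mutually
    comparable (e.g. integers) so each edge can be reported as (u, v)
    with u < v.
    """
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            adj[u].add(v)
            adj[v].add(u)

    def support(u, v):
        # Number of triangles that currently contain edge (u, v).
        return len(adj[u] & adj[v])

    # Start with every edge whose support is already too low.
    pending = [(u, v) for u in adj for v in adj[u]
               if u < v and support(u, v) < k - 2]

    while pending:
        u, v = pending.pop()
        if v not in adj[u]:
            continue  # edge was already removed
        common = adj[u] & adj[v]
        adj[u].discard(v)
        adj[v].discard(u)
        # Removing (u, v) destroys one triangle for each common neighbour w,
        # so the supports of (u, w) and (v, w) must be rechecked.
        for w in common:
            for a, b in ((u, w), (v, w)):
                if support(a, b) < k - 2:
                    pending.append((a, b))

    return {(u, v) for u in adj for v in adj[u] if u < v}


# A 4-clique on nodes 0..3 with one pendant edge (3, 4) attached.
example_edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3), (3, 4)]
four_truss = k_truss(example_edges, 4)  # the six clique edges survive
five_truss = k_truss(example_edges, 5)  # nothing survives
```

In the example, the pendant edge lies in no triangle and is peeled away, while every clique edge lies in exactly two triangles, so the clique is a 4-truss but not a 5-truss. Each edge is removed at most once and each removal only touches the neighbourhoods of its endpoints, which is why the procedure runs in polynomial time.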
