Compressing big graph data: A relative node importance approach

Given the complex graph of a large network, one typically wants to extract the information of interest from it. Uncovering the underlying structure of an extremely large graph dataset requires compression, which also reduces the cost of subsequent communication and computation. In this paper, we present an effective compression scheme for complex graph datasets based on a newly proposed node centrality metric named relative node importance (RNI). Besides measuring the distance and connectivity distributions of the graph structure, we for the first time take the k-core distribution into consideration. Compared with existing schemes, the proposed one has lower computational complexity and fits different kinds of networks, e.g., social networks, the World Wide Web (WWW), and autonomous systems (AS) networks. Numerical results show that our RNI-based method outperforms other schemes while preserving the basic features of a graph well.
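
To make the k-core distribution mentioned above concrete, the following is a minimal sketch of the standard peeling procedure for k-core decomposition. It is not the paper's RNI metric or its compression scheme, neither of which is defined in this abstract. The function names `core_numbers` and `k_core_distribution`, the adjacency-dictionary input format, and the toy graph are assumptions made for illustration only.

```python
from collections import Counter


def core_numbers(adj):
    """Core number of every node in an undirected simple graph.

    `adj` maps each node to the set of its neighbours (hypothetical input
    format chosen for this sketch). Nodes are peeled in order of current
    degree; this simple version costs O(n^2) and is meant for small graphs.
    """
    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        # Remove the node with the smallest remaining degree.
        v = min(remaining, key=degree.__getitem__)
        k = max(k, degree[v])  # core numbers never decrease during peeling
        core[v] = k
        remaining.remove(v)
        for u in adj[v]:
            if u in remaining:
                degree[u] -= 1
    return core


def k_core_distribution(adj):
    """Fraction of nodes at each core number, keyed by core number."""
    counts = Counter(core_numbers(adj).values())
    n = len(adj)
    return {k: c / n for k, c in sorted(counts.items())}


# Toy graph (illustrative): a triangle a-b-c with a pendant node d on c.
toy = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
print(k_core_distribution(toy))  # {1: 0.25, 2: 0.75}
```

Comparing this distribution before and after compression is one plausible way to check whether a compressed graph keeps the core structure of the original; how the paper itself evaluates this is not stated in the abstract.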
