Compression of weighted graphs

We propose to compress weighted graphs (networks), motivated by the observation that large networks of social, biological, or other relations can be complex to handle and visualize. In the process also known as graph simplification, nodes and (unweighted) edges are grouped to supernodes and superedges, respectively, to obtain a smaller graph. We propose models and algorithms for weighted graphs. The interpretation (i.e. decompression) of a compressed, weighted graph is that a pair of original nodes is connected by an edge if their supernodes are connected by one, and that the weight of an edge is approximated to be the weight of the superedge. The compression problem now consists of choosing supernodes, superedges, and superedge weights so that the approximation error is minimized while the amount of compression is maximized. In this paper, we formulate this task as the 'simple weighted graph compression problem'. We then propose a much wider class of tasks under the name of 'generalized weighted graph compression problem'. The generalized task extends the optimization to preserve longer-range connectivities between nodes, not just individual edge weights. We study the properties of these problems and propose a range of algorithms to solve them, with different balances between complexity and quality of the result. We evaluate the problems and algorithms experimentally on real networks. The results indicate that weighted graphs can be compressed efficiently with relatively little compression error.

[1]  Micah Adler,et al.  Towards compressing Web graphs , 2001, Proceedings DCC 2001. Data Compression Conference.

[2]  Jiawei Han,et al.  Mining Graph Patterns Efficiently via Randomized Summaries , 2009, Proc. VLDB Endow..

[3]  Michael C. Schatz,et al.  Revealing Biological Modules via Graph Summarization , 2009, J. Comput. Biol..

[4]  Jignesh M. Patel,et al.  Discovery-driven graph summarization , 2010, 2010 IEEE 26th International Conference on Data Engineering (ICDE 2010).

[5]  H. White,et al.  STRUCTURAL EQUIVALENCE OF INDIVIDUALS IN SOCIAL NETWORKS , 1977 .

[6]  Christos Faloutsos,et al.  Fast discovery of connection subgraphs , 2004, KDD.

[7]  Fang Zhou,et al.  A Framework for Path-Oriented Network Simplification , 2010, IDA.

[8]  Ulrich Elsner,et al.  Graph partitioning - a survey , 2005 .

[9]  Philip S. Yu,et al.  Graph OLAP: Towards Online Analytical Processing on Graphs , 2008, 2008 Eighth IEEE International Conference on Data Mining.

[10]  Godfried T. Toussaint,et al.  The relative neighbourhood graph of a finite planar set , 1980, Pattern Recognit..

[11]  Nisheeth Shrivastava,et al.  Graph summarization with bounded error , 2008, SIGMOD Conference.

[12]  Hannu Toivonen,et al.  Finding reliable subgraphs from large probabilistic graphs , 2008, Data Mining and Knowledge Discovery.

[13]  Sebastiano Vigna,et al.  The webgraph framework I: compression techniques , 2004, WWW '04.

[14]  S. Borgatti,et al.  Regular blockmodels of multiway, multimode matrices☆ , 1992 .

[15]  Jignesh M. Patel,et al.  Efficient aggregation for graph summarization , 2008, SIGMOD Conference.

[16]  Jiawei Han,et al.  Parallel PathFinder Algorithms for Mining Structures from Graphs , 2009, 2009 Ninth IEEE International Conference on Data Mining.