Network Compression by Node and Edge Mergers

We give methods to compress weighted graphs (i.e., networks or BisoNets) into smaller ones. The motivation is that large networks of social, biological, or other relations can be complex to handle and visualize. Using the given methods, nodes and edges of a given graph are grouped into supernodes and superedges, respectively. The interpretation (i.e., decompression) of a compressed graph is that a pair of original nodes is connected by an edge if their supernodes are connected by a superedge, and that the weight of that edge equals the weight of the superedge. The compression problem then consists of choosing supernodes, superedges, and superedge weights so that the approximation error is minimized while the amount of compression is maximized. In this chapter, we describe this task as the 'simple weighted graph compression problem'. We also discuss a much wider class of tasks under the name 'generalized weighted graph compression problem'. The generalized task extends the optimization to preserve longer-range connectivities between nodes, not just individual edge weights. We study the properties of these problems and outline a range of algorithms to solve them, with different trade-offs between complexity and quality of the result. We evaluate the problems and algorithms experimentally on real networks. The results indicate that weighted graphs can be compressed efficiently with relatively little compression error.
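The decompression rule stated above can be illustrated with a minimal Python sketch. It is not the chapter's implementation: the data layout, the function names `decompress` and `compression_error`, and the choice of a root-sum-of-squares distance over node pairs as the error measure are all assumptions made for illustration. The abstract says only that an approximation error is minimized.

```python
from itertools import combinations
from math import sqrt


def decompress(supernodes, superedges):
    """Expand a compressed graph back into weighted edges.

    supernodes: dict mapping a supernode name to its set of original nodes.
    superedges: dict mapping frozenset({A, B}) (or frozenset({A}) for a
                superedge inside one supernode) to a weight.
    Both layouts are hypothetical, chosen only for this sketch.
    """
    edges = {}
    for key, weight in superedges.items():
        names = tuple(key)
        if len(names) == 1:
            # Superedge within a single supernode: connect all member pairs.
            pairs = combinations(sorted(supernodes[names[0]]), 2)
        else:
            a, b = names
            pairs = ((u, v) for u in supernodes[a] for v in supernodes[b])
        for u, v in pairs:
            # Every decompressed edge inherits the superedge weight.
            edges[frozenset((u, v))] = weight
    return edges


def compression_error(original, decompressed):
    """Assumed error measure: root of summed squared weight differences,
    treating a missing edge as weight 0."""
    pairs = set(original) | set(decompressed)
    return sqrt(sum((original.get(p, 0.0) - decompressed.get(p, 0.0)) ** 2
                    for p in pairs))


# Toy graph: edges a-c (0.8) and b-c (0.6); merge a and b into supernode S.
original = {frozenset(("a", "c")): 0.8, frozenset(("b", "c")): 0.6}
supernodes = {"S": {"a", "b"}, "T": {"c"}}
# Superedge weight set to the mean of the merged edge weights (an assumption).
superedges = {frozenset(("S", "T")): 0.7}

decompressed = decompress(supernodes, superedges)
error = compression_error(original, decompressed)
```

In this toy case three nodes and two edges shrink to two supernodes and one superedge, and each decompressed edge weight differs from its original by 0.1. Using the mean as the superedge weight is a natural choice under the assumed squared-error measure, since the mean minimizes the sum of squared deviations.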
