Optimal Representation for Web and Social Network Graphs Based on $K^2$-Tree

With the rapid growth of the Internet, the scale of graphs has increased dramatically, which poses special challenges for representing both web graphs and social network graphs. In the adjacency matrices of web and social network graphs, only a very small proportion of the elements are 1s. Furthermore, we find that aggregating the scattered 1s into dense regions of the adjacency matrix benefits storage compression. Based on these findings, we propose DGC-$K^2$-tree, a compression approach built on the $K^2$-tree, which raises the density of 1s well beyond that achieved by existing algorithms and compresses the blank areas of the adjacency matrix effectively. We then design a query algorithm for this mechanism to support operations on the graph. The experimental results show that, compared with state-of-the-art algorithms, including the $K^2$-tree based on a diagonal clustering mechanism ($K^2$-BDC), the $K^2$-tree, Re-Pair, and LZ78, our approach achieves a better compression ratio and lower time consumption. In terms of storage efficiency, our approach reduces space by an average of 34.07% compared with $K^2$-BDC, the best-performing of these algorithms on storage. In terms of query efficiency, our approach reduces query time by an average of 80.63% compared with LZ78, the best-performing of these algorithms on query time.
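
To make the underlying data structure concrete, the following is a minimal sketch of a plain $K^2$-tree over a small adjacency matrix. It is not the DGC-$K^2$-tree proposed here: the clustering step that aggregates scattered 1s is not described in this abstract and is not reproduced. The function names `build_k2tree` and `has_edge`, the list-based bitmaps `T` and `L`, and the example matrix are all assumptions made for illustration.

```python
from collections import deque


def build_k2tree(matrix, k=2):
    """Build the two bitmaps of a plain k^2-tree (illustrative sketch only).

    T holds the bits of the internal levels, L the bits of the last level.
    The matrix is conceptually zero-padded to size x size, size a power of k.
    """
    n = len(matrix)
    size = k
    while size < n:
        size *= k

    def cell(r, c):
        # Cells outside the original matrix are padding and count as 0.
        return matrix[r][c] if r < n and c < n else 0

    def has_one(r0, c0, s):
        # True if the s x s submatrix at (r0, c0) contains at least one 1.
        return any(cell(r, c)
                   for r in range(r0, min(r0 + s, n))
                   for c in range(c0, min(c0 + s, n)))

    T, L = [], []
    queue = deque([(0, 0, size)])  # breadth-first: (row origin, col origin, side)
    while queue:
        r0, c0, s = queue.popleft()
        sub = s // k
        for i in range(k):
            for j in range(k):
                r, c = r0 + i * sub, c0 + j * sub
                if sub == 1:
                    L.append(cell(r, c))       # last level stores raw cells
                else:
                    bit = 1 if has_one(r, c, sub) else 0
                    T.append(bit)
                    if bit:                    # all-zero blocks are pruned here
                        queue.append((r, c, sub))
    return T, L, size


def has_edge(T, L, size, k, row, col):
    """Return True if cell (row, col) is 1, descending from the root."""
    bits = T + L
    base, s = 0, size
    while True:
        s //= k
        pos = base + (row // s) * k + (col // s)
        if s == 1:
            return bits[pos] == 1
        if bits[pos] == 0:
            return False
        # Children of the node at pos start at rank1(T, pos) * k^2.
        # A real implementation would use a constant-time rank structure.
        base = sum(T[:pos + 1]) * k * k
        row, col = row % s, col % s


example = [
    [0, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 0, 1],
]
T, L, size = build_k2tree(example, k=2)
```

For this example matrix, `T` is `[1, 0, 0, 1]` and `L` is `[0, 1, 0, 0, 1, 1, 0, 1]`: the two all-zero quadrants cost one bit each, which illustrates why concentrating 1s into fewer blocks tends to shrink the representation.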
