Compressed Representation of Web and Social Networks via Dense Subgraphs

Mining and analyzing large web and social networks is challenging in terms of storage and information access. To address this problem, several works have proposed compressing large graphs while still allowing neighbor access over the compressed representation. In this paper, we propose a novel compressed structure that aims to reduce storage and support efficient navigation over compressed representations of web and social graphs. Our approach uses clustering and mining to find dense subgraphs and represents them using compact data structures. We perform experiments on a wide range of web and social networks and compare our results with the best known techniques. Our results show that we improve the state-of-the-art space/time tradeoffs for supporting neighbor queries. Our compressed structure also enables mining queries based on dense subgraphs, such as cliques and bicliques.
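To illustrate why dense subgraphs help compression, the following is a minimal sketch in Python, not the structure proposed in the paper. It assumes a biclique is stored as a pair of node lists (sources, centers), so that |S| x |C| edges cost only |S| + |C| stored identifiers, and that any remaining edges are kept in a plain adjacency map. The names `out_neighbors`, `stored_ids`, and `edge_count` are hypothetical and chosen only for this illustration.

```python
from typing import Dict, List, Tuple

# A biclique (S, C): every node in S links to every node in C.
Biclique = Tuple[List[int], List[int]]


def out_neighbors(node: int,
                  bicliques: List[Biclique],
                  residual: Dict[int, List[int]]) -> List[int]:
    """Answer a neighbor query directly on the compressed form."""
    result = set(residual.get(node, ()))
    for sources, centers in bicliques:
        if node in sources:
            # The node inherits every center of a biclique it belongs to.
            result.update(centers)
    return sorted(result)


def stored_ids(bicliques: List[Biclique],
               residual: Dict[int, List[int]]) -> int:
    """Count node identifiers kept by the compressed form."""
    in_bicliques = sum(len(s) + len(c) for s, c in bicliques)
    in_residual = sum(len(targets) for targets in residual.values())
    return in_bicliques + in_residual


def edge_count(bicliques: List[Biclique],
               residual: Dict[int, List[int]]) -> int:
    """Count edges of the original graph (assumes no edge is stored twice)."""
    in_bicliques = sum(len(s) * len(c) for s, c in bicliques)
    in_residual = sum(len(targets) for targets in residual.values())
    return in_bicliques + in_residual


# Toy graph: nodes 1..3 all link to 10..13, and node 1 also links to 20.
bicliques = [([1, 2, 3], [10, 11, 12, 13])]
residual = {1: [20]}

print(out_neighbors(1, bicliques, residual))
print(stored_ids(bicliques, residual), "ids for",
      edge_count(bicliques, residual), "edges")
```

In this toy graph, 13 edges are described by 8 stored identifiers, and the biclique itself can be listed without reconstructing the graph, which is the kind of mining query the abstract refers to. A practical structure would replace the Python lists with compact sequences and bitmaps supporting rank/select.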
