Evaluation of Graph Sampling: A Visualization Perspective

Graph sampling is frequently used to address scalability issues when analyzing large graphs. Many sampling algorithms have been proposed, and their performance has been quantified through metrics based on the structural properties that sampling preserves, such as degree distribution and clustering coefficient. What is missing, however, is a perspective on how these sampling strategies affect the resulting visualizations. In this paper, we present the results of three user studies that investigate how sampling strategies influence node-link visualizations of graphs. In particular, we test five sampling strategies widely used in the graph mining literature to determine how well they preserve visual features in node-link diagrams. Our results show that different visual features are preserved depending on the sampling strategy used. These results provide a complementary view to the metric evaluations conducted in the graph mining literature and provide an impetus for future visualization studies.
