Large graph visualizations using a distributed computing platform

Big Data analytics is recognized as one of the major issues in our current information society, and it raises several challenges and opportunities in many fields, including economy and finance, e-commerce, public health and administration, national security, and scientific research. Visualization techniques are an essential ingredient for making sense of large volumes of information, especially for the analysis of complex interrelated data, which are represented as graphs. The growing availability of powerful and inexpensive cloud computing services naturally motivates the study of distributed graph visualization algorithms that can scale to the size of large graphs. We study the problem of designing a distributed visualization algorithm that is simple to implement and whose computing infrastructure does not require major hardware or software investments. We design, implement, and experimentally evaluate a force-directed algorithm in Giraph, a popular open source framework for distributed computing based on a vertex-centric design paradigm. The algorithm is tested on both real and artificial graphs with up to one million edges. The experiments show the scalability and effectiveness of our technique when compared to a centralized implementation of the same force-directed model. Graphs with about one million edges can be drawn in a few minutes, at a cost of about 1 USD per drawing on Amazon's cloud computing infrastructure.
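
To make the vertex-centric idea concrete, the following is a minimal sketch in plain Python, not the algorithm evaluated in the paper and not Giraph code. It assumes forces in the style of Fruchterman and Reingold (repulsion k^2/d, attraction d^2/k), and it simplifies by letting each vertex react only to the positions received from its direct neighbors. The names `vertex_compute` and `layout`, the parameter values, and the cooling schedule are all hypothetical choices made for illustration.

```python
import math
import random


def vertex_compute(own, inbox, k, temp):
    """One vertex's work in a superstep: combine the forces exerted by the
    positions received as messages and move by at most `temp`."""
    fx = fy = 0.0
    for ox, oy in inbox:
        dx, dy = own[0] - ox, own[1] - oy
        d = max(math.hypot(dx, dy), 1e-9)  # guard against division by zero
        # Assumed force model: repulsion k^2/d pushes away from the sender,
        # attraction d^2/k pulls toward it. Positive f means "move away".
        f = k * k / d - d * d / k
        fx += dx / d * f
        fy += dy / d * f
    magnitude = math.hypot(fx, fy)
    if magnitude == 0.0:
        return own
    step = min(magnitude, temp)  # the temperature caps the displacement
    return (own[0] + fx / magnitude * step, own[1] + fy / magnitude * step)


def layout(edges, n, iterations=50, k=1.0, seed=0):
    """Simulate synchronous supersteps on one machine (illustrative only)."""
    rng = random.Random(seed)
    pos = {v: (rng.uniform(-1, 1), rng.uniform(-1, 1)) for v in range(n)}
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    temp = 0.1
    for _ in range(iterations):
        # All vertices read the same snapshot, which mimics messages that
        # were sent in the previous superstep.
        snapshot = dict(pos)
        pos = {
            v: vertex_compute(snapshot[v], [snapshot[u] for u in adj[v]], k, temp)
            for v in range(n)
        }
        temp *= 0.95  # cool down so that the layout settles
    return pos


if __name__ == "__main__":
    # A 4-cycle: adjacent vertices should end up roughly k apart.
    for vertex, (x, y) in sorted(layout([(0, 1), (1, 2), (2, 3), (3, 0)], 4).items()):
        print(vertex, round(x, 3), round(y, 3))
```

In a distributed framework each call to `vertex_compute` would run on the worker that owns the vertex, and the snapshot would be replaced by messages exchanged between supersteps. Restricting repulsion to direct neighbors keeps the sketch short; a practical layout needs repulsion from a wider set of vertices to separate unrelated parts of the graph.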
