Speedup Graph Processing by Graph Ordering

CPU cache performance is one of the key factors in the efficiency of database systems. Cache miss latency is reported to account for half of the execution time in database systems. To improve CPU cache performance, prior studies have supported searching with cache-oblivious and cache-conscious trees. In this paper, we focus on speeding up graph computing in general by reducing the CPU cache miss ratio across different graph algorithms. Approaches designed for trees are not applicable to graphs, which are complex in nature. We explore a general approach to speeding up CPU computation that further improves the efficiency of graph algorithms without changing either the algorithms (their implementations) or the data structures used. That is, we aim to design a general solution that is tied neither to a specific graph algorithm nor to a specific data structure. The approach studied in this work is graph ordering: finding the optimal permutation of all nodes in a given graph so that nodes that are frequently accessed together are stored close to each other, which minimizes the CPU cache miss ratio. We prove that the graph ordering problem is NP-hard and give a basic algorithm with a bounded approximation. To improve on the basic algorithm, we further propose a new algorithm that reduces the time complexity and improves efficiency through new optimization techniques based on a new data structure. We conducted extensive experiments comparing our approach with nine other possible graph orderings (such as the one obtained by METIS) on eight large real graphs and nine representative graph algorithms. The results confirm that our approach achieves high performance by reducing the CPU cache miss ratio.
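
To make the idea of graph ordering concrete, the following is a minimal sketch of relabeling nodes by a permutation, leaving the graph algorithm and data structure untouched. It is an illustration under stated assumptions, not the algorithm proposed in the paper: the breadth-first ordering is only a simple baseline heuristic, and the helper names (`relabel`, `total_edge_gap`, `bfs_order`) and the edge-gap locality proxy are hypothetical choices made for this example. The proxy sums the ID distance across edges and is not a measurement of cache misses.

```python
from collections import deque
from typing import List, Tuple

Edge = Tuple[int, int]


def relabel(edges: List[Edge], order: List[int]) -> List[Edge]:
    """Apply a node ordering: order[i] is the old ID placed at position i."""
    new_id = {old: pos for pos, old in enumerate(order)}
    return [(new_id[u], new_id[v]) for u, v in edges]


def total_edge_gap(edges: List[Edge]) -> int:
    """Hypothetical locality proxy: sum of |id(u) - id(v)| over all edges.

    A smaller value means neighbors tend to sit closer together in memory
    when node data is stored in ID order.
    """
    return sum(abs(u - v) for u, v in edges)


def bfs_order(n: int, edges: List[Edge], start: int = 0) -> List[int]:
    """Baseline ordering (assumption, not the paper's method): BFS visit order."""
    adj: List[List[int]] = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    seen = [False] * n
    order: List[int] = []
    # Try the requested start first, then any node not yet reached.
    for s in [start] + list(range(n)):
        if seen[s]:
            continue
        seen[s] = True
        queue = deque([s])
        while queue:
            u = queue.popleft()
            order.append(u)
            for v in sorted(adj[u]):
                if not seen[v]:
                    seen[v] = True
                    queue.append(v)
    return order


if __name__ == "__main__":
    # A path graph whose node IDs are scrambled: 0-5-2-7-1-6-3-4.
    edges = [(0, 5), (5, 2), (2, 7), (7, 1), (1, 6), (6, 3), (3, 4)]
    order = bfs_order(8, edges)
    print("gap before:", total_edge_gap(edges))
    print("gap after: ", total_edge_gap(relabel(edges, order)))
```

Because only the node IDs change, any graph algorithm can run on the relabeled edge list as-is. Whether a lower edge gap translates into fewer cache misses depends on the access pattern of the algorithm and on the hardware.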
