Paper information - A lightweight infrastructure for graph analytics

A lightweight infrastructure for graph analytics

Several domain-specific languages (DSLs) for parallel graph analytics have been proposed recently. In this paper, we argue that existing DSLs can be implemented on top of a general-purpose infrastructure that (i) supports very fine-grain tasks, (ii) implements autonomous, speculative execution of these tasks, and (iii) allows application-specific control of task scheduling policies. To support this claim, we describe such an implementation called the Galois system. We demonstrate the capabilities of this infrastructure in three ways. First, we implement more sophisticated algorithms for some of the graph analytics problems tackled by previous DSLs and show that end-to-end performance can be improved by orders of magnitude even on power-law graphs, thanks to the better algorithms facilitated by a more general programming model. Second, we show that, even when an algorithm can be expressed in existing DSLs, the implementation of that algorithm in the more general system can be orders of magnitude faster when the input graphs are road networks and similar graphs with high diameter, thanks to more sophisticated scheduling. Third, we implement the APIs of three existing graph DSLs on top of the common infrastructure in a few hundred lines of code and show that even for power-law graphs, the performance of the resulting implementations often exceeds that of the original DSL systems, thanks to the lightweight infrastructure.
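The scheduling claim can be illustrated with a small sequential sketch. It does not use the Galois API and models neither parallelism nor speculation. It only shows how the same fine-grained relaxation task for single-source shortest paths can run under two scheduling policies. The names `build_graph` and `sssp`, the policy strings, and the toy graph (a chain of unit-weight edges plus heavy shortcut edges from the source) are assumptions made for illustration.

```python
import heapq
from collections import deque


def build_graph(n):
    """Hypothetical toy input: chain 0 -> 1 -> ... -> n-1 with unit weights,
    plus heavy shortcut edges 0 -> i (weight 10 * i) for i >= 2."""
    graph = {i: [] for i in range(n)}
    for i in range(n - 1):
        graph[i].append((i + 1, 1))
    for i in range(2, n):
        graph[0].append((i, 10 * i))
    return graph


def sssp(graph, source, scheduler="fifo"):
    """Worklist-driven shortest paths with a pluggable scheduling policy.

    Each worklist item (distance, node) is one fine-grained task.
    Returns (dist, tasks), where tasks counts the non-stale tasks executed.
    """
    dist = {v: float("inf") for v in graph}
    dist[source] = 0
    if scheduler == "fifo":
        work = deque([(0, source)])
        pop, push = work.popleft, work.append
    elif scheduler == "priority":
        work = [(0, source)]
        pop = lambda: heapq.heappop(work)
        push = lambda item: heapq.heappush(work, item)
    else:
        raise ValueError(f"unknown scheduler: {scheduler}")

    tasks = 0
    while work:
        d, u = pop()
        if d > dist[u]:
            continue  # stale task: a shorter path to u was found meanwhile
        tasks += 1
        for v, w in graph[u]:
            nd = d + w
            if nd < dist[v]:
                dist[v] = nd
                push((nd, v))  # relaxing an edge creates a new task
    return dist, tasks


if __name__ == "__main__":
    g = build_graph(5)
    for policy in ("fifo", "priority"):
        dist, tasks = sssp(g, 0, policy)
        print(policy, tasks, dist)
```

Both policies compute the same distances, but the FIFO order first propagates distances that arrived over the heavy shortcuts and later has to redo that work, while the priority order executes exactly one useful task per node. On a chain-like, high-diameter input this gap grows with the path length, which is the kind of effect the abstract attributes to more sophisticated scheduling on road networks.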
