论文信息 - TuNao: A High-Performance and Energy-Efficient Reconfigurable Accelerator for Graph Processing

TuNao: A High-Performance and Energy-Efficient Reconfigurable Accelerator for Graph Processing

Large-scale graph processing is now a crucial task of many commercial applications, and it is conventionally supported by general-purpose processors. These processors are designed to flexibly support highly diverse workloads with classic techniques such as on-chip cache and dynamic pipelining. Yet, it is difficult for the on-chip cache to exploit irregular data locality in large-scale graph processing, even though there are a few high-degree vertices that are frequently accessed in real-world graphs, it is not efficient to perform regular arithmetic operations via sophisticated dynamic pipelining. In short, general-purpose processors could not be the ideal platforms to graph processing. In this paper, we design a reconfigurable graph processing accelerator, with the purpose of providing an energy-efficient and flexible hardware platform for large-scale graph processing. This accelerator features two main components, i.e., the on-chip storage to exploit the data locality of graph processing, and the reconfigurable functional units to adapt to diversified operations in different graph processing tasks. On a total of 36 practical graph processing tasks, we demonstrate that, on average, our accelerator design achieves 1.58x and 25.56x better performance and energy efficiency, respectively, than the GPU baseline.

[1] Joseph Gonzalez,et al. PowerGraph: Distributed Graph-Parallel Computation on Natural Graphs , 2012, OSDI.

[2] Kiyoung Choi,et al. A scalable processing-in-memory accelerator for parallel graph processing , 2015, 2015 ACM/IEEE 42nd Annual International Symposium on Computer Architecture (ISCA).

[3] Margaret Martonosi,et al. Graphicionado: A high-performance and energy-efficient accelerator for graph analytics , 2016, 2016 49th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).

[4] Paolo Ienne,et al. Elastic CGRAs , 2013, FPGA '13.

[5] Keval Vora,et al. CuSha: vertex-centric graph processing on GPUs , 2014, HPDC '14.

[6] Guy E. Blelloch,et al. GraphChi: Large-Scale Graph Computation on Just a PC , 2012, OSDI.

[7] Willy Zwaenepoel,et al. X-Stream: edge-centric graph processing using streaming partitions , 2013, SOSP.

[8] Chao Wang,et al. SODA: Software defined FPGA based accelerators for big data , 2015, 2015 Design, Automation & Test in Europe Conference & Exhibition (DATE).

[9] Michalis Faloutsos,et al. On power-law relationships of the Internet topology , 1999, SIGCOMM '99.