An efficient graph accelerator with parallel data conflict management
暂无分享,去创建一个
[1] Kiyoung Choi,et al. A scalable processing-in-memory accelerator for parallel graph processing , 2015, 2015 ACM/IEEE 42nd Annual International Symposium on Computer Architecture (ISCA).
[2] David A. Patterson,et al. Locality Exists in Graph Processing: Workload Characterization on an Ivy Bridge Server , 2015, 2015 IEEE International Symposium on Workload Characterization.
[3] Yunhao Liu,et al. AnySee: Peer-to-Peer Live Streaming , 2006, Proceedings IEEE INFOCOM 2006. 25TH IEEE International Conference on Computer Communications.
[4] Aart J. C. Bik,et al. Pregel: a system for large-scale graph processing , 2010, SIGMOD Conference.
[5] Ravi Rajwar,et al. Speculative lock elision: enabling highly concurrent multithreaded execution , 2001, Proceedings. 34th ACM/IEEE International Symposium on Microarchitecture. MICRO-34.
[6] Hai Jin,et al. Efficient and Scalable Graph Parallel Processing With Symbolic Execution , 2018, ACM Trans. Archit. Code Optim..
[7] Alexandru Iosup,et al. How Well Do Graph-Processing Platforms Perform? An Empirical Performance Evaluation and Analysis , 2014, 2014 IEEE 28th International Parallel and Distributed Processing Symposium.
[8] Simon Knowles,et al. A family of adders , 1999, Proceedings 14th IEEE Symposium on Computer Arithmetic (Cat. No.99CB36336).
[9] Viktor K. Prasanna,et al. High-Throughput and Energy-Efficient Graph Processing on FPGA , 2016, 2016 IEEE 24th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM).
[10] Joseph Gonzalez,et al. PowerGraph: Distributed Graph-Parallel Computation on Natural Graphs , 2012, OSDI.
[11] Hyesoon Kim,et al. BSSync: Processing Near Memory for Machine Learning Workloads with Bounded Staleness Consistency Models , 2015, 2015 International Conference on Parallel Architecture and Compilation (PACT).
[12] Wenguang Chen,et al. GridGraph: Large-Scale Graph Processing on a Single Machine Using 2-Level Hierarchical Partitioning , 2015, USENIX ATC.
[13] Kiyoung Choi,et al. PIM-enabled instructions: A low-overhead, locality-aware processing-in-memory architecture , 2015, 2015 ACM/IEEE 42nd Annual International Symposium on Computer Architecture (ISCA).
[14] H. T. Kung,et al. A Regular Layout for Parallel Adders , 1982, IEEE Transactions on Computers.
[15] Wencong Xiao,et al. GraM: scaling graph computation to the trillions , 2015, SoCC.
[16] James C. Hoe,et al. GraphGen: An FPGA Framework for Vertex-Centric Graph Computation , 2014, FCCM 2014.
[17] Yu Wang,et al. ForeGraph: Exploring Large-scale Graph Processing on Multi-FPGA Architecture , 2017, FPGA.
[18] Bruce Jacob,et al. DRAMSim2: A Cycle Accurate Memory System Simulator , 2011, IEEE Computer Architecture Letters.
[19] Timothy A. Davis,et al. The university of Florida sparse matrix collection , 2011, TOMS.
[20] Pradeep Dubey,et al. GraphMat: High performance graph analytics made productive , 2015, Proc. VLDB Endow..
[21] John D. Owens,et al. Gunrock: a high-performance graph processing library on the GPU , 2016, PPoPP 2016.
[22] Maurice Herlihy,et al. Transactional Memory: Architectural Support For Lock-free Data Structures , 1993, Proceedings of the 20th Annual International Symposium on Computer Architecture.
[23] Maya Gokhale,et al. Processing in Memory: The Terasys Massively Parallel PIM Array , 1995, Computer.
[24] Guy E. Blelloch,et al. Ligra: a lightweight graph processing framework for shared memory , 2013, PPoPP '13.
[25] Guy E. Blelloch,et al. Scans as Primitive Parallel Operations , 1989, ICPP.
[26] Harold S. Stone,et al. A Parallel Algorithm for the Efficient Solution of a General Class of Recurrence Equations , 1973, IEEE Transactions on Computers.
[27] Jason Cong,et al. Automatic memory partitioning and scheduling for throughput and power optimization , 2009, 2009 IEEE/ACM International Conference on Computer-Aided Design - Digest of Technical Papers.
[28] Margaret Martonosi,et al. Graphicionado: A high-performance and energy-efficient accelerator for graph analytics , 2016, 2016 49th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[29] Ramyad Hadidi,et al. GraphPIM: Enabling Instruction-Level PIM Offloading in Graph Computing Frameworks , 2017, 2017 IEEE International Symposium on High Performance Computer Architecture (HPCA).
[30] Alexandru Iosup,et al. An Empirical Performance Evaluation of GPU-Enabled Graph-Processing Systems , 2015, 2015 15th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing.
[31] K. Pagiamtzis,et al. Content-addressable memory (CAM) circuits and architectures: a tutorial and survey , 2006, IEEE Journal of Solid-State Circuits.
[32] Ozcan Ozturk,et al. Energy Efficient Architecture for Graph Analytics Accelerators , 2016, 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA).
[33] Jack Sklansky,et al. Conditional-Sum Addition Logic , 1960, IRE Trans. Electron. Comput..
[34] Mohammed J. Zaki,et al. Arabesque: a system for distributed graph mining , 2015, SOSP.
[35] Yiran Chen,et al. GraphR: Accelerating Graph Processing Using ReRAM , 2017, 2018 IEEE International Symposium on High Performance Computer Architecture (HPCA).
[36] Jure Leskovec,et al. {SNAP Datasets}: {Stanford} Large Network Dataset Collection , 2014 .
[37] Yafei Dai,et al. Garaph: Efficient GPU-accelerated Graph Processing on a Single Machine with Balanced Replication , 2017, USENIX Annual Technical Conference.
[38] Kunle Olukotun,et al. GraphOps: A Dataflow Library for Graph Analytics Acceleration , 2016, FPGA.
[39] Jason Cong,et al. Memory partitioning for multidimensional arrays in high-level synthesis , 2013, 2013 50th ACM/EDAC/IEEE Design Automation Conference (DAC).