Slice-and-Forge: Making Better Use of Caches for Graph Convolutional Network Accelerators
[1] Philip Levis, et al. GRIP: A Graph Neural Network Accelerator Architecture. IEEE Transactions on Computers, 2020.
[2] Haoran You, et al. I-GCN: A Graph Convolutional Network Accelerator with Runtime Locality Enhancement through Islandization. MICRO, 2021.
[3] Paolo Ienne, et al. Large-Scale Graph Processing on FPGAs with Caches for Thousands of Simultaneous Misses. ISCA, 2021.
[4] Arvind, et al. FlexMiner: A Pattern-Aware Accelerator for Graph Pattern Mining. ISCA, 2021.
[5] Tony Nowatzki, et al. PolyGraph: Exposing the Value of Flexibility for Graph Processing Accelerators. ISCA, 2021.
[6] Hanwoong Jung, et al. Sparsity-Aware and Re-configurable NPU Architecture for Samsung Flagship Mobile SoC. ISCA, 2021.
[7] Daniel Sánchez, et al. SpZip: Architectural Support for Effective Data Compression in Irregular Applications. ISCA, 2021.
[8] Yang Wang, et al. Dual-side Sparse Tensor Core. ISCA, 2021.
[9] Chao-Tsung Huang, et al. RingCNN: Exploiting Algebraically-Sparse Ring Tensors for Energy-Efficient CNN-Based Computational Imaging. ISCA, 2021.
[10] Ahmed Louri, et al. GCNAX: A Flexible and Energy-Efficient Accelerator for Graph Convolutional Neural Networks. HPCA, 2021.
[11] Dongrui Fan, et al. Hardware Acceleration for GCNs via Bidirectional Fusion. IEEE Computer Architecture Letters, 2021.
[12] Huawei Li, et al. EnGN: A High-Throughput and Energy-Efficient Accelerator for Large Graph Neural Networks. IEEE Transactions on Computers, 2019.
[13] Long Zheng, et al. A Locality-Aware Energy-Efficient Accelerator for Graph Mining Applications. MICRO, 2020.
[14] Xing Hu, et al. Rubik: A Hierarchical Architecture for Efficient Graph Learning. arXiv, 2020.
[15] Rakesh Kumar, et al. Hardware Acceleration of Graph Neural Networks. DAC, 2020.
[16] Bruce Jacob, et al. DRAMsim3: A Cycle-Accurate, Thermal-Capable DRAM Simulator. IEEE Computer Architecture Letters, 2020.
[17] J. Leskovec, et al. Open Graph Benchmark: Datasets for Machine Learning on Graphs. NeurIPS, 2020.
[18] Song Han, et al. SpArch: Efficient Architecture for Sparse Matrix Multiplication. HPCA, 2020.
[19] Dipankar Das, et al. SIGMA: A Sparse and Irregular GEMM Accelerator with Flexible Interconnects for DNN Training. HPCA, 2020.
[20] Nitish Srivastava, et al. Tensaurus: A Versatile Accelerator for Mixed Sparse-Dense Tensor Computations. HPCA, 2020.
[21] Bahar Asgari, et al. ALRESCHA: A Lightweight Reconfigurable Sparse-Computation Accelerator. HPCA, 2020.
[22] Yuxiao Dong, et al. Microsoft Academic Graph: When Experts Are Not Enough. Quantitative Science Studies, 2020.
[23] Dongrui Fan, et al. HyGCN: A GCN Accelerator with Hybrid Architecture. HPCA, 2020.
[24] Yanzhi Wang, et al. PatDNN: Achieving Real-Time DNN Execution on Mobile Devices with Pattern-Based Weight Pruning. ASPLOS, 2020.
[25] Antonino Tumeo, et al. AWB-GCN: A Graph Convolutional Network Accelerator with Runtime Workload Rebalancing. MICRO, 2020.
[26] Shikhar Vashishth. Neural Graph Embedding Methods for Natural Language Processing. arXiv, 2019.
[27] Onur Mutlu, et al. SMASH: Co-designing Software Compression and Hardware-Accelerated Indexing for Efficient Sparse Matrix Operations. MICRO, 2019.
[28] Tze Meng Low, et al. Efficient SpMV Operation for Large and Highly Sparse Matrices Using Scalable Multi-Way Merge Parallelization. MICRO, 2019.
[29] Aamer Jaleel, et al. ExTensor: An Accelerator for Sparse Tensor Algebra. MICRO, 2019.
[30] Jose-Maria Arnau, et al. Neuron-Level Fuzzy Memoization in RNNs. MICRO, 2019.
[31] Zhiru Zhang, et al. Boosting the Performance of CNN Accelerators with Dynamic Fine-Grained Channel Gating. MICRO, 2019.
[32] Gu-Yeon Wei, et al. MaxNVM: Maximizing DNN Storage Density and Inference Efficiency with Sparse Encoding and Error Mitigation. MICRO, 2019.
[33] T. N. Vijaykumar, et al. SparTen: A Sparse Tensor Accelerator for Convolutional Neural Networks. MICRO, 2019.
[34] Yuan Xie, et al. Sparse Tensor Core: Algorithm and Hardware Co-Design for Vector-Wise Sparse Neural Networks on Modern GPUs. MICRO, 2019.
[35] Chao-Tsung Huang, et al. eCNN: A Block-Based and Highly-Parallel CNN Accelerator for Edge Inference. MICRO, 2019.
[36] Yanzhi Wang, et al. GraphQ: Scalable PIM-Based Graph Processing. MICRO, 2019.
[37] William J. Dally, et al. Simba: Scaling Deep-Learning Inference with Multi-Chip-Module-Based Architecture. MICRO, 2019.
[38] Jason Cong, et al. Overcoming Data Transfer Bottlenecks in FPGA-Based DNN Accelerators via Layer Conscious Memory Management. DAC, 2019.
[39] Bernard Ghanem, et al. DeepGCNs: Can GCNs Go As Deep As CNNs? ICCV, 2019.
[40] Tian Jin, et al. Split-CNN: Splitting Window-Based Operations in Convolutional Neural Networks for Memory System Optimization. ASPLOS, 2019.
[41] P. Sadayappan, et al. Adaptive Sparse Tiling for Sparse Matrix Multiplication. PPoPP, 2019.
[42] Jure Leskovec, et al. How Powerful Are Graph Neural Networks? ICLR, 2019.
[43] Ge Li, et al. Mini-Batch Serialization: CNN Training with Inter-Layer Data Reuse. MLSys, 2019.
[44] Jung Ho Ahn, et al. Restructuring Batch Normalization to Accelerate CNN Training. SysML, 2018.
[45] Matthew Mattina, et al. SCALE-Sim: Systolic CNN Accelerator. arXiv, 2018.
[46] Tianshi Chen, et al. Cambricon-S: Addressing Irregularity in Sparse Neural Networks through a Cooperative Software/Hardware Approach. MICRO, 2018.
[47] Priyanka Raina, et al. DNN Dataflow Choice Is Overrated. arXiv, 2018.
[48] Dylan Malone Stuart, et al. Memory Requirements for Convolutional Neural Network Hardware Accelerators. IISWC, 2018.
[49] Jiajun Li, et al. SmartShuttle: Optimizing Off-Chip Memory Accesses for Deep Learning Accelerators. DATE, 2018.
[50] Yixin Chen, et al. Link Prediction Based on Graph Neural Networks. NeurIPS, 2018.
[51] Christoforos E. Kozyrakis, et al. GraphP: Reducing Communication for PIM-Based Graph Processing with Efficient Data Partition. HPCA, 2018.
[52] Xiao-Ming Wu, et al. Deeper Insights into Graph Convolutional Networks for Semi-Supervised Learning. AAAI, 2018.
[53] Alex Fout, et al. Protein Interface Prediction Using Graph Convolutional Networks. NIPS, 2017.
[54] Chao Wang, et al. CirCNN: Accelerating and Compressing Deep Neural Networks Using Block-Circulant Weight Matrices. MICRO, 2017.
[55] Kiyoung Choi, et al. ExtraV: Boosting Graph Processing Near Storage with a Coherent Accelerator. Proceedings of the VLDB Endowment, 2017.
[56] Carole-Jean Wu, et al. MCM-GPU: Multi-Chip-Module GPUs for Continued Performance Scalability. ISCA, 2017.
[57] Jure Leskovec, et al. Inductive Representation Learning on Large Graphs. NIPS, 2017.
[58] William J. Dally, et al. SCNN: An Accelerator for Compressed-Sparse Convolutional Neural Networks. ISCA, 2017.
[59] Patrick Judd, et al. Cnvlutin2: Ineffectual-Activation-and-Weight-Free Deep Neural Network Computing. arXiv, 2017.
[60] David A. Patterson, et al. In-Datacenter Performance Analysis of a Tensor Processing Unit. ISCA, 2017.
[61] Max Welling, et al. Semi-Supervised Classification with Graph Convolutional Networks. ICLR, 2017.
[62] C. Priebe, et al. Semi-External Memory Sparse Matrix Multiplication for Billion-Node Graphs. IEEE Transactions on Parallel and Distributed Systems, 2016.
[63] V. Sze, et al. Eyeriss: An Energy-Efficient Reconfigurable Accelerator for Deep Convolutional Neural Networks. IEEE Journal of Solid-State Circuits, 2016.
[64] Manoj Alwani, et al. Fused-Layer CNN Accelerators. MICRO, 2016.
[65] Shaoli Liu, et al. Cambricon-X: An Accelerator for Sparse Neural Networks. MICRO, 2016.
[66] Margaret Martonosi, et al. Graphicionado: A High-Performance and Energy-Efficient Accelerator for Graph Analytics. MICRO, 2016.
[67] Jure Leskovec, et al. node2vec: Scalable Feature Learning for Networks. KDD, 2016.
[68] Ozcan Ozturk, et al. Energy Efficient Architecture for Graph Analytics Accelerators. ISCA, 2016.
[69] Natalie D. Enright Jerger, et al. Cnvlutin: Ineffectual-Neuron-Free Deep Neural Network Computing. ISCA, 2016.
[70] Vivienne Sze, et al. Eyeriss: A Spatial Architecture for Energy-Efficient Dataflow for Convolutional Neural Networks. ISCA, 2016.
[71] Natalia Gimelshein, et al. vDNN: Virtualized Deep Neural Networks for Scalable, Memory-Efficient Neural Network Design. MICRO, 2016.
[72] Song Han, et al. EIE: Efficient Inference Engine on Compressed Deep Neural Network. ISCA, 2016.
[73] Pinar Yanardag, et al. Deep Graph Kernels. KDD, 2015.
[74] Wenguang Chen, et al. GridGraph: Large-Scale Graph Processing on a Single Machine Using 2-Level Hierarchical Partitioning. USENIX ATC, 2015.
[75] Tianshi Chen, et al. ShiDianNao: Shifting Vision Processing Closer to the Sensor. ISCA, 2015.
[76] Kiyoung Choi, et al. A Scalable Processing-in-Memory Accelerator for Parallel Graph Processing. ISCA, 2015.
[77] Jeffrey Pennington, et al. GloVe: Global Vectors for Word Representation. EMNLP, 2014.
[78] Jure Leskovec, et al. SNAP Datasets: Stanford Large Network Dataset Collection, 2014.
[79] Samuel Williams, et al. Optimizing Sparse Matrix-Multiple Vectors Multiplication for Nuclear Configuration Interaction Calculations. IPDPS, 2014.
[80] Jure Leskovec, et al. Defining and Evaluating Network Communities Based on Ground-Truth. Knowledge and Information Systems, 2012.
[81] L. Takac. Data Analysis in Public Social Networks, 2012.
[82] J. Thomas Pawlowski, et al. Hybrid Memory Cube (HMC). IEEE Hot Chips 23 Symposium (HCS), 2011.
[83] Xi Zhang, et al. A Cache Replacement Policy Using Adaptive Insertion and Re-Reference Prediction. International Symposium on Computer Architecture and High Performance Computing, 2010.
[84] Estevam R. Hruschka, et al. Toward an Architecture for Never-Ending Language Learning. AAAI, 2010.
[85] Jure Leskovec, et al. Community Structure in Large Networks: Natural Cluster Sizes and the Absence of Large Well-Defined Clusters. Internet Mathematics, 2008.
[86] Lise Getoor, et al. Collective Classification in Network Data. AI Magazine, 2008.
[87] Christos Faloutsos, et al. Graphs over Time: Densification Laws, Shrinking Diameters and Possible Explanations. KDD, 2005.
[88] Rajeev Thakur, et al. Optimization of Collective Communication Operations in MPICH. International Journal of High Performance Computing Applications, 2005.