Extending Sparse Tensor Accelerators to Support Multiple Compression Formats
Dipankar Das | Sheng-Chun Kao | Sivasankaran Rajamanickam | Hyoukjun Kwon | Tushar Krishna | Sudarshan Srinivasan | Gordon E. Moon | Eric Qin | Geonhwa Jeong | William Won