Spartan: A Sparsity-Adaptive Framework to Accelerate Deep Neural Network Training on GPUs
David Kaeli | José L. Abellán | Daniel Lowell | Nicolas Bohm Agostini | José Cano | Yifan Sun | Shi Dong | Elmira Karimi | Jing Zhou
[1] Andrew Zisserman, et al. Very Deep Convolutional Networks for Large-Scale Image Recognition, 2014, ICLR.
[2] Tianshi Chen, et al. Cambricon-S: Addressing Irregularity in Sparse Neural Networks through A Cooperative Software/Hardware Approach, 2018, 51st Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[3] Stephen W. Keckler, et al. Compressing DMA Engine: Leveraging Activation Sparsity for Training Deep Neural Networks, 2017, IEEE International Symposium on High Performance Computer Architecture (HPCA) 2018.
[4] Dumitru Erhan, et al. Going Deeper with Convolutions, 2014, IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2015.
[5] Chia-Lin Yang, et al. Sparse ReRAM Engine: Joint Exploration of Activation and Weight Sparsity in Compressed Neural Networks, 2019, 46th Annual International Symposium on Computer Architecture (ISCA).
[6] David R. Kaeli, et al. Evaluating Performance Tradeoffs on the Radeon Open Compute Platform, 2018, IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS).
[7] Yuan Xie, et al. Model Compression and Hardware Acceleration for Neural Networks: A Comprehensive Survey, 2020, Proceedings of the IEEE.
[8] Alex Krizhevsky, et al. Learning Multiple Layers of Features from Tiny Images, 2009.
[9] Song Han, et al. EIE: Efficient Inference Engine on Compressed Deep Neural Network, 2016, 43rd Annual International Symposium on Computer Architecture (ISCA).
[10] Gu-Yeon Wei, et al. Minerva: Enabling Low-Power, Highly-Accurate Deep Neural Network Accelerators, 2016, 43rd Annual International Symposium on Computer Architecture (ISCA).
[11] Vivienne Sze, et al. Eyeriss v2: A Flexible Accelerator for Emerging Deep Neural Networks on Mobile Devices, 2018, IEEE Journal on Emerging and Selected Topics in Circuits and Systems.
[12] Li Fei-Fei, et al. ImageNet: A Large-Scale Hierarchical Image Database, 2009, CVPR.
[13] Jian Sun, et al. Deep Residual Learning for Image Recognition, 2015, IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2016.
[14] Edition, 2018, Xenophanes von Kolophon.
[15] Peter A. Beerel, et al. Pre-Defined Sparse Neural Networks With Hardware Acceleration, 2018, IEEE Journal on Emerging and Selected Topics in Circuits and Systems.
[16] Chao Liu, et al. MIOpen: An Open Source Library For Deep Learning Primitives, 2019, Proceedings of the 30th International Conference on Computer Graphics and Machine Vision (GraphiCon 2020), Part 2.
[17] David R. Kaeli, et al. Characterizing the Microarchitectural Implications of a Convolutional Neural Network (CNN) Execution on GPUs, 2018, ICPE.
[18] Guigang Zhang, et al. Deep Learning, 2016, Int. J. Semantic Comput.
[19] Scott A. Mahlke, et al. Scalpel: Customizing DNN Pruning to the Underlying Hardware Parallelism, 2017, 44th Annual International Symposium on Computer Architecture (ISCA).
[20] Amar Phanishayee, et al. Gist: Efficient Data Encoding for Deep Neural Network Training, 2018, 45th Annual International Symposium on Computer Architecture (ISCA).
[21] Geoffrey E. Hinton, et al. ImageNet Classification with Deep Convolutional Neural Networks, 2012, Commun. ACM.
[22] Yong Wang, et al. SDA: Software-Defined Accelerator for Large-Scale DNN Systems, 2014, IEEE Hot Chips 26 Symposium (HCS).
[23] Dipankar Das, et al. SIGMA: A Sparse and Irregular GEMM Accelerator with Flexible Interconnects for DNN Training, 2020, IEEE International Symposium on High Performance Computer Architecture (HPCA).
[24] Song Han, et al. ESE: Efficient Speech Recognition Engine with Sparse LSTM on FPGA, 2016, FPGA.
[25] Chen Zhang, et al. Efficient and Effective Sparse LSTM on FPGA with Bank-Balanced Sparsity, 2019, FPGA.
[26] Bin Yang, et al. SBNet: Sparse Blocks Network for Fast Inference, 2018, IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[27] Gu-Yeon Wei, et al. Minerva: Enabling Low-Power, Highly-Accurate Deep Neural Network Accelerators, 2016, 43rd Annual International Symposium on Computer Architecture (ISCA).
[28] Hao Wu, et al. Mixed Precision Training, 2017, ICLR.
[29] Amos J. Storkey, et al. Characterising Across-Stack Optimisations for Deep Convolutional Neural Networks, 2018, IEEE International Symposium on Workload Characterization (IISWC).
[30] John R. Gilbert, et al. Sparse Matrices in MATLAB: Design and Implementation, 1992, SIAM J. Matrix Anal. Appl.
[31] David R. Kaeli, et al. Heterogeneous Computing with OpenCL - Revised OpenCL 1.2 Edition, 2012.
[32] Jung Ho Ahn, et al. The Design Space of Data-Parallel Memory Systems, 2006, ACM/IEEE SC 2006 Conference (SC'06).
[33] Weixing Ji, et al. A Systematic Survey of General Sparse Matrix-Matrix Multiplication, 2020, ArXiv.
[34] David R. Kaeli, et al. DNNMark: A Deep Neural Network Benchmark Suite for GPUs, 2017, GPGPU@PPoPP.
[35] Luca Benini, et al. Extended Bit-Plane Compression for Convolutional Neural Network Accelerators, 2018, IEEE International Conference on Artificial Intelligence Circuits and Systems (AICAS) 2019.
[36] Quan Chen, et al. DjiNN and Tonic: DNN as a Service and Its Implications for Future Warehouse Scale Computers, 2015, 42nd Annual International Symposium on Computer Architecture (ISCA).
[37] Zhongliang Chen, et al. MGPUSim: Enabling Multi-GPU Performance Modeling and Optimization, 2019, 46th Annual International Symposium on Computer Architecture (ISCA).
[38] Jian Sun, et al. Identity Mappings in Deep Residual Networks, 2016, ECCV.
[39] John Tran, et al. cuDNN: Efficient Primitives for Deep Learning, 2014, ArXiv.
[40] Ester M. Garzón, et al. Improving the Performance of the Sparse Matrix Vector Product with GPUs, 2010, 10th IEEE International Conference on Computer and Information Technology.
[41] Francesco Visin, et al. A Guide to Convolution Arithmetic for Deep Learning, 2016, ArXiv.