ESCALATE: Boosting the Efficiency of Sparse CNN Accelerator with Kernel Decomposition
Shiyu Li | Edward Hanson | Xuehai Qian | Hai Li | Yiran Chen