Analysis of a Pipelined Architecture for Sparse DNNs on Embedded Systems
Javier Resano | Hortensia Mecha | Javier Olivito | Adrián Alcolea Moreno
[1] Natalie D. Enright Jerger, et al. Cnvlutin: Ineffectual-Neuron-Free Deep Neural Network Computing, 2016, 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA).
[2] Yann LeCun, et al. Optimal Brain Damage, 1989, NIPS.
[3] Vivienne Sze, et al. Efficient Processing of Deep Neural Networks: A Tutorial and Survey, 2017, Proceedings of the IEEE.
[4] Pipelined architecture for sparse DNNs, 2019.
[5] Richard Vuduc, et al. Automatic performance tuning of sparse matrix kernels, 2003.
[6] Joel Emer, et al. Eyeriss: An Energy-Efficient Reconfigurable Accelerator for Deep Convolutional Neural Networks, 2017, IEEE Journal of Solid-State Circuits.
[7] Forrest N. Iandola, et al. SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <1MB model size, 2016, ArXiv.
[8] Christos-Savvas Bouganis, et al. Toolflows for Mapping Convolutional Neural Networks on FPGAs, 2018, ACM Computing Surveys.
[9] Ali Mirzaeian, et al. NESTA: Hamming Weight Compression-Based Neural Proc. Engine, 2019, 2020 25th Asia and South Pacific Design Automation Conference (ASP-DAC).
[10] Song Han, et al. Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding, 2015, ICLR.
[11] Scott A. Mahlke, et al. Scalpel: Customizing DNN pruning to the underlying hardware parallelism, 2017, 2017 ACM/IEEE 44th Annual International Symposium on Computer Architecture (ISCA).
[12] Sparsh Mittal, et al. A survey of FPGA-based accelerators for convolutional neural networks, 2018, Neural Computing and Applications.
[13] Song Han, et al. Learning both Weights and Connections for Efficient Neural Network, 2015, NIPS.
[14] Tobi Delbrück, et al. ADaPTION: Toolbox and Benchmark for Training Convolutional Neural Networks with Reduced Numerical Precision Weights and Activation, 2017, ArXiv.
[15] Vivienne Sze, et al. Eyeriss v2: A Flexible Accelerator for Emerging Deep Neural Networks on Mobile Devices, 2018, IEEE Journal on Emerging and Selected Topics in Circuits and Systems.
[16] Song Han, et al. Trained Ternary Quantization, 2016, ICLR.
[17] Avesta Sasan, et al. TCD-NPE: A Re-configurable and Efficient Neural Processing Engine, Powered by Novel Temporal-Carry-deferring MACs, 2019, 2019 International Conference on ReConFigurable Computing and FPGAs (ReConFig).
[18] Song Han, et al. EIE: Efficient Inference Engine on Compressed Deep Neural Network, 2016, 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA).
[19] Hirofumi Nakanishi, et al. WT210/WT230 Digital Power Meters, 2003.
[20] Soheil Ghiasi, et al. Hardware-oriented Approximation of Convolutional Neural Networks, 2016, ArXiv.
[21] Vivienne Sze, et al. Designing Energy-Efficient Convolutional Neural Networks Using Energy-Aware Pruning, 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[22] William J. Dally, et al. SCNN: An accelerator for compressed-sparse convolutional neural networks, 2017, 2017 ACM/IEEE 44th Annual International Symposium on Computer Architecture (ISCA).
[23] Mengjia Yan, et al. UCNN: Exploiting Computational Reuse in Deep Neural Networks via Weight Repetition, 2018, 2018 ACM/IEEE 45th Annual International Symposium on Computer Architecture (ISCA).
[24] Shaoli Liu, et al. Cambricon-X: An accelerator for sparse neural networks, 2016, 2016 49th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[25] Suyog Gupta, et al. To prune, or not to prune: exploring the efficacy of pruning for model compression, 2017, ICLR.
[26] Xiaowei Li, et al. SqueezeFlow: A Sparse CNN Accelerator Exploiting Concise Convolution Rules, 2019, IEEE Transactions on Computers.
[27] Alessandro Aimar, et al. NullHop: A Flexible Convolutional Neural Network Accelerator Based on Sparse Representations of Feature Maps, 2017, IEEE Transactions on Neural Networks and Learning Systems.
[28] Lin Xu, et al. Incremental Network Quantization: Towards Lossless CNNs with Low-Precision Weights, 2017, ICLR.
[29] Li Fei-Fei, et al. ImageNet: A large-scale hierarchical image database, 2009, CVPR.
[30] Babak Hassibi, et al. Second Order Derivatives for Network Pruning: Optimal Brain Surgeon, 1992, NIPS.