IVQ: In-Memory Acceleration of DNN Inference Exploiting Varied Quantization
Yiran Chen, Zongwu Wang, Fangxin Liu, Li Jiang, Yilong Zhao, Tao Yang, Wenbo Zhao
[1] Li Jiang, et al. Bit-Transformer: Transforming Bit-level Sparsity into Higher Performance in ReRAM-based Accelerator, 2021, 2021 IEEE/ACM International Conference On Computer Aided Design (ICCAD).
[2] Yanzhi Wang, et al. Improving Neural Network Efficiency via Post-training Quantization with Adaptive Floating-Point, 2021, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).
[3] Li Jiang, et al. IM3A: Boosting Deep Neural Network Efficiency via In-Memory Addressing-Assisted Acceleration, 2021, ACM Great Lakes Symposium on VLSI.
[4] Hang Liu, et al. FORMS: Fine-grained Polarized ReRAM-based In-situ Computation for Mixed-signal DNN Accelerator, 2021, 2021 ACM/IEEE 48th Annual International Symposium on Computer Architecture (ISCA).
[5] Wenbo Zhao, et al. SME: ReRAM-based Sparse-Multiplication-Engine to Squeeze-Out Bit Sparsity of Neural Network, 2021, 2021 IEEE 39th International Conference on Computer Design (ICCD).
[6] Yiran Chen, et al. BSQ: Exploring Bit-Level Sparsity for Mixed-Precision Neural Network Quantization, 2021, ICLR.
[7] Xin Si, et al. 15.4 A 5.99-to-691.1TOPS/W Tensor-Train In-Memory-Computing Processor Using Bit-Level-Sparsity-Based Optimization and Variable-Precision Quantization, 2021, 2021 IEEE International Solid-State Circuits Conference (ISSCC).
[8] Yanzhi Wang, et al. Mix and Match: A Novel FPGA-Centric Deep Neural Network Quantization Framework, 2020, 2021 IEEE International Symposium on High-Performance Computer Architecture (HPCA).
[9] Li Jiang, et al. DRQ: Dynamic Region-based Quantization for Deep Neural Network Acceleration, 2020, 2020 ACM/IEEE 47th Annual International Symposium on Computer Architecture (ISCA).
[10] Yuan Xie, et al. Model Compression and Hardware Acceleration for Neural Networks: A Comprehensive Survey, 2020, Proceedings of the IEEE.
[11] Bin Gao, et al. Fully hardware-implemented memristor convolutional neural network, 2020, Nature.
[12] Wei Tang, et al. CASCADE: Connecting RRAMs to Extend Analog Dataflow In An End-To-End In-Memory Processing Paradigm, 2019, MICRO.
[13] Zhiru Zhang, et al. Boosting the Performance of CNN Accelerators with Dynamic Fine-Grained Channel Gating, 2019, MICRO.
[14] Wei Wang, et al. Additive Powers-of-Two Quantization: An Efficient Non-uniform Discretization for Neural Networks, 2019, International Conference on Learning Representations.
[15] Anand Raghunathan, et al. X-MANN: A Crossbar based Architecture for Memory Augmented Neural Networks, 2019, 2019 56th ACM/IEEE Design Automation Conference (DAC).
[16] Swagath Venkataramani, et al. BiScaled-DNN: Quantizing Long-tailed Datastructures with Two Scale Factors for Deep Neural Networks, 2019, 2019 56th ACM/IEEE Design Automation Conference (DAC).
[17] Jie Lin, et al. Noise Injection Adaption: End-to-End ReRAM Crossbar Non-ideal Effect Adaption for Neural Network Mapping, 2019, 2019 56th ACM/IEEE Design Automation Conference (DAC).
[18] Chia-Lin Yang, et al. Sparse ReRAM Engine: Joint Exploration of Activation and Weight Sparsity in Compressed Neural Networks, 2019, 2019 ACM/IEEE 46th Annual International Symposium on Computer Architecture (ISCA).
[19] Kurt Keutzer, et al. HAWQ: Hessian AWare Quantization of Neural Networks With Mixed-Precision, 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[20] Steven K. Esser, et al. Learned Step Size Quantization, 2019, ICLR.
[21] Yun Liang, et al. REQ-YOLO: A Resource-Aware, Efficient Quantization Framework for Object Detection on FPGAs, 2019, FPGA.
[22] Dejan S. Milojicic, et al. PUMA: A Programmable Ultra-efficient Memristor-based Accelerator for Machine Learning Inference, 2019, ASPLOS.
[23] Zhijian Liu, et al. HAQ: Hardware-Aware Automated Quantization With Mixed Precision, 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[24] Deliang Fan, et al. Simultaneously Optimizing Weight and Quantizer of Ternary Neural Network Using Truncated Gaussian Approximation, 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[25] Magnus Själander, et al. BISMO: A Scalable Bit-Serial Matrix Multiplication Overlay for Reconfigurable Computing, 2018, 2018 28th International Conference on Field Programmable Logic and Applications (FPL).
[26] Scott A. Mahlke, et al. In-Memory Data Parallel Processor, 2018, ASPLOS.
[27] Rajeev Balasubramonian, et al. Newton: Gravitating Towards the Physical Limits of Crossbar Acceleration, 2018, IEEE Micro.
[28] Swagath Venkataramani, et al. PACT: Parameterized Clipping Activation for Quantized Neural Networks, 2018, ArXiv.
[29] Meng-Fan Chang, et al. A 65nm 1Mb nonvolatile computing-in-memory ReRAM macro with sub-16ns multiply-and-accumulate for binary DNN AI edge processors, 2018, 2018 IEEE International Solid-State Circuits Conference (ISSCC).
[30] Mark Sandler, et al. MobileNetV2: Inverted Residuals and Linear Bottlenecks, 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[31] Bo Chen, et al. Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference, 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[32] Hadi Esmaeilzadeh, et al. Bit Fusion: Bit-Level Dynamically Composable Architecture for Accelerating Deep Neural Network, 2017, 2018 ACM/IEEE 45th Annual International Symposium on Computer Architecture (ISCA).
[33] Wei Pan, et al. Towards Accurate Binary Convolutional Neural Network, 2017, NIPS.
[34] Shenghuo Zhu, et al. Extremely Low Bit Neural Network: Squeeze the Last Bit Out with ADMM, 2017, AAAI.
[35] Jan Reineke, et al. Ascertaining Uncertainty for Efficient Exact Cache Analysis, 2017, CAV.
[36] Yiran Chen, et al. PipeLayer: A Pipelined ReRAM-Based Accelerator for Deep Learning, 2017, 2017 IEEE International Symposium on High Performance Computer Architecture (HPCA).
[37] Lin Xu, et al. Incremental Network Quantization: Towards Lossless CNNs with Low-Precision Weights, 2017, ICLR.
[38] Yu Wang, et al. PRIME: A Novel Processing-in-Memory Architecture for Neural Network Computation in ReRAM-Based Main Memory, 2016, 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA).
[39] Miao Hu, et al. ISAAC: A Convolutional Neural Network Accelerator with In-Situ Analog Arithmetic in Crossbars, 2016, 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA).
[40] Vivienne Sze, et al. Eyeriss: A Spatial Architecture for Energy-Efficient Dataflow for Convolutional Neural Networks, 2016, 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA).
[41] Bin Liu, et al. Ternary Weight Networks, 2016, ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[42] Ali Farhadi, et al. XNOR-Net: ImageNet Classification Using Binary Convolutional Neural Networks, 2016, ECCV.
[43] Ran El-Yaniv, et al. Binarized Neural Networks, 2016, NIPS.
[44] Jian Cheng, et al. Quantized Convolutional Neural Networks for Mobile Devices, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[45] Jian Sun, et al. Deep Residual Learning for Image Recognition, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[46] Yoshua Bengio, et al. BinaryConnect: Training Deep Neural Networks with binary weights during propagations, 2015, NIPS.
[47] Song Han, et al. Deep Compression: Compressing Deep Neural Network with Pruning, Trained Quantization and Huffman Coding, 2015, ICLR.
[48] Sergey Ioffe, et al. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift, 2015, ICML.
[49] Jia Wang, et al. DaDianNao: A Machine-Learning Supercomputer, 2014, 2014 47th Annual IEEE/ACM International Symposium on Microarchitecture.
[50] Yiran Chen, et al. Reduction and IR-drop compensations techniques for reliable neuromorphic computing systems, 2014, 2014 IEEE/ACM International Conference on Computer-Aided Design (ICCAD).
[51] Andrew Zisserman, et al. Very Deep Convolutional Networks for Large-Scale Image Recognition, 2014, ICLR.
[52] Geoffrey E. Hinton, et al. ImageNet classification with deep convolutional neural networks, 2012, Commun. ACM.
[53] Cong Xu, et al. NVSim: A Circuit-Level Performance, Energy, and Area Model for Emerging Nonvolatile Memory, 2012, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems.
[54] Fei-Fei Li, et al. ImageNet: A large-scale hierarchical image database, 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.
[55] Norman P. Jouppi, et al. Optimizing NUCA Organizations and Wiring Alternatives for Large Caches with CACTI 6.0, 2007, 40th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO 2007).
[56] Alex Krizhevsky, et al. Learning Multiple Layers of Features from Tiny Images, 2009.