SoBS-X: Squeeze-Out Bit Sparsity for ReRAM-Crossbar-Based Neural Network Accelerator
[1] Yiran Chen,et al. IVQ: In-Memory Acceleration of DNN Inference Exploiting Varied Quantization , 2022, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems.
[2] Li Jiang,et al. Bit-Transformer: Transforming Bit-level Sparsity into Higher Performance in ReRAM-based Accelerator , 2021, 2021 IEEE/ACM International Conference On Computer Aided Design (ICCAD).
[3] Yanzhi Wang,et al. Improving Neural Network Efficiency via Post-training Quantization with Adaptive Floating-Point , 2021, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).
[4] Li Jiang,et al. IM3A: Boosting Deep Neural Network Efficiency via In-Memory Addressing-Assisted Acceleration , 2021, ACM Great Lakes Symposium on VLSI.
[5] Hang Liu,et al. FORMS: Fine-grained Polarized ReRAM-based In-situ Computation for Mixed-signal DNN Accelerator , 2021, 2021 ACM/IEEE 48th Annual International Symposium on Computer Architecture (ISCA).
[6] Wenbo Zhao,et al. SME: ReRAM-based Sparse-Multiplication-Engine to Squeeze-Out Bit Sparsity of Neural Network , 2021, 2021 IEEE 39th International Conference on Computer Design (ICCD).
[7] Yiran Chen,et al. BSQ: Exploring Bit-Level Sparsity for Mixed-Precision Neural Network Quantization , 2021, ICLR.
[8] Xin Si,et al. 15.4 A 5.99-to-691.1TOPS/W Tensor-Train In-Memory-Computing Processor Using Bit-Level-Sparsity-Based Optimization and Variable-Precision Quantization , 2021, 2021 IEEE International Solid-State Circuits Conference (ISSCC).
[9] Yanzhi Wang,et al. Mix and Match: A Novel FPGA-Centric Deep Neural Network Quantization Framework , 2020, 2021 IEEE International Symposium on High-Performance Computer Architecture (HPCA).
[10] Yiran Chen,et al. ReTransformer: ReRAM-based Processing-in-Memory Architecture for Transformer Acceleration , 2020, 2020 IEEE/ACM International Conference On Computer Aided Design (ICCAD).
[11] Yanzhi Wang,et al. PIM-Prune: Fine-Grain DCNN Pruning for Crossbar-Based Process-In-Memory Architecture , 2020, 2020 57th ACM/IEEE Design Automation Conference (DAC).
[12] Yu Wang,et al. Low Bit-Width Convolutional Neural Network on RRAM , 2020, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems.
[13] Mark Chen,et al. Language Models are Few-Shot Learners , 2020, NeurIPS.
[14] Yue Wang,et al. SmartExchange: Trading Higher-cost Memory Storage/Access for Lower-cost Computation , 2020, 2020 ACM/IEEE 47th Annual International Symposium on Computer Architecture (ISCA).
[15] Vijaykrishnan Narayanan,et al. GaaS-X: Graph Analytics Accelerator Supporting Sparse Data Representation using Crossbar Architectures , 2020, 2020 ACM/IEEE 47th Annual International Symposium on Computer Architecture (ISCA).
[16] Yuan Xie,et al. Model Compression and Hardware Acceleration for Neural Networks: A Comprehensive Survey , 2020, Proceedings of the IEEE.
[17] Xiaochen Peng,et al. DNN+NeuroSim V2.0: An End-to-End Benchmarking Framework for Compute-in-Memory Accelerators for On-Chip Training , 2020, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems.
[18] Bin Gao,et al. Fully hardware-implemented memristor convolutional neural network , 2020, Nature.
[19] Lei Deng,et al. SemiMap: A Semi-Folded Convolution Mapping for Speed-Overhead Balance on Crossbars , 2020, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems.
[20] Yanzhi Wang,et al. PatDNN: Achieving Real-Time DNN Execution on Mobile Devices with Pattern-based Weight Pruning , 2020, ASPLOS.
[21] Natalia Gimelshein,et al. PyTorch: An Imperative Style, High-Performance Deep Learning Library , 2019, NeurIPS.
[22] Wei Tang,et al. CASCADE: Connecting RRAMs to Extend Analog Dataflow In An End-To-End In-Memory Processing Paradigm , 2019, MICRO.
[23] Wei Wang,et al. Additive Powers-of-Two Quantization: An Efficient Non-uniform Discretization for Neural Networks , 2019, ICLR.
[24] Chia-Lin Yang,et al. Sparse ReRAM Engine: Joint Exploration of Activation and Weight Sparsity in Compressed Neural Networks , 2019, 2019 ACM/IEEE 46th Annual International Symposium on Computer Architecture (ISCA).
[25] Yanzhi Wang,et al. ResNet Can Be Pruned 60×: Introducing Network Purification and Unused Path Removal (P-RM) after Weight Pruning , 2019, 2019 IEEE/ACM International Symposium on Nanoscale Architectures (NANOARCH).
[26] Yuan Xie,et al. FPSA: A Full System Stack Solution for Reconfigurable ReRAM-based NN Accelerator Architecture , 2019, ASPLOS.
[27] Yuan Xie,et al. Learning the sparsity for ReRAM: mapping and pruning sparse neural network for ReRAM based accelerator , 2019, ASP-DAC.
[28] Zhijian Liu,et al. HAQ: Hardware-Aware Automated Quantization With Mixed Precision , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[29] Yuan Xie,et al. Crossbar-Aware Neural Network Pruning , 2018, IEEE Access.
[30] Yongqiang Lyu,et al. SNrram: An Efficient Sparse Neural Network Computation Architecture Based on Resistive Random-Access Memory , 2018, 2018 55th ACM/ESDA/IEEE Design Automation Conference (DAC).
[31] Scott A. Mahlke,et al. In-Memory Data Parallel Processor , 2018, ASPLOS.
[32] Yiran Chen,et al. ReCom: An efficient resistive accelerator for compressed deep neural networks , 2018, 2018 Design, Automation & Test in Europe Conference & Exhibition (DATE).
[33] Meng-Fan Chang,et al. A 65nm 1Mb nonvolatile computing-in-memory ReRAM macro with sub-16ns multiply-and-accumulate for binary DNN AI edge processors , 2018, 2018 IEEE International Solid-State Circuits Conference (ISSCC).
[34] Chi-Ying Tsui,et al. A high-throughput and energy-efficient RRAM-based convolutional neural network using data encoding and dynamic quantization , 2018, 2018 23rd Asia and South Pacific Design Automation Conference (ASP-DAC).
[35] Mark Sandler,et al. MobileNetV2: Inverted Residuals and Linear Bottlenecks , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[36] Bo Chen,et al. Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[37] Yu Wang,et al. PRIME: A Novel Processing-in-Memory Architecture for Neural Network Computation in ReRAM-Based Main Memory , 2016, 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA).
[38] Miao Hu,et al. ISAAC: A Convolutional Neural Network Accelerator with In-Situ Analog Arithmetic in Crossbars , 2016, 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA).
[39] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[40] Andrew Zisserman,et al. Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.
[41] Meng-Fan Chang,et al. 19.4 Embedded 1Mb ReRAM in 28nm CMOS with 0.27-to-1V read using swing-sample-and-couple sense amplifier and self-boost-write-termination scheme , 2014, 2014 IEEE International Solid-State Circuits Conference Digest of Technical Papers (ISSCC).
[42] Cong Xu,et al. NVSim: A Circuit-Level Performance, Energy, and Area Model for Emerging Nonvolatile Memory , 2012, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems.
[43] Fei-Fei Li,et al. ImageNet: A large-scale hierarchical image database , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.
[44] Alex Krizhevsky,et al. Learning Multiple Layers of Features from Tiny Images , 2009.