Optimizing Weight Mapping and Data Flow for Convolutional Neural Networks on Processing-in-Memory Architectures
暂无分享,去创建一个
Xiaochen Peng | Shimeng Yu | Rui Liu | Shimeng Yu | Xiaochen Peng | Rui Liu
[1] Shimeng Yu,et al. Neuro-Inspired Computing With Emerging Nonvolatile Memorys , 2018, Proceedings of the IEEE.
[2] Marian Verhelst,et al. 5 ENVISION : A 0 . 26-to-10 TOPS / W Subword-Parallel Dynamic-Voltage-Accuracy-Frequency-Scalable Convolutional Neural Network Processor in 28 nm FDSOI , 2017 .
[3] Michael S. Bernstein,et al. ImageNet Large Scale Visual Recognition Challenge , 2014, International Journal of Computer Vision.
[4] Chung-Cheng Chou,et al. An N40 256K×44 embedded RRAM macro with SL-precharge SA and low-voltage current limiter to improve read and write performance , 2018, 2018 IEEE International Solid - State Circuits Conference - (ISSCC).
[5] Hoi-Jun Yoo,et al. 14.2 DNPU: An 8.1TOPS/W reconfigurable CNN-RNN processor for general-purpose deep neural networks , 2017, 2017 IEEE International Solid-State Circuits Conference (ISSCC).
[6] Shimeng Yu,et al. Emerging Memory Technologies: Recent Trends and Prospects , 2016, IEEE Solid-State Circuits Magazine.
[7] David A. Patterson,et al. In-datacenter performance analysis of a tensor processing unit , 2017, 2017 ACM/IEEE 44th Annual International Symposium on Computer Architecture (ISCA).
[8] Xiaochen Peng,et al. Optimizing Weight Mapping and Data Flow for Convolutional Neural Networks on RRAM Based Processing-In-Memory Architecture , 2019, 2019 IEEE International Symposium on Circuits and Systems (ISCAS).
[9] Rajiv V. Joshi,et al. An Energy-Efficient Digital ReRAM-Crossbar-Based CNN With Bitwise Parallelism , 2017, IEEE Journal on Exploratory Solid-State Computational Devices and Circuits.
[10] Shimeng Yu,et al. Technology-design co-optimization of resistive cross-point array for accelerating learning algorithms on chip , 2015, 2015 Design, Automation & Test in Europe Conference & Exhibition (DATE).
[11] Jia Wang,et al. DaDianNao: A Machine-Learning Supercomputer , 2014, 2014 47th Annual IEEE/ACM International Symposium on Microarchitecture.
[12] Shuang Wu,et al. Training High-Performance and Large-Scale Deep Neural Networks with Full 8-bit Integers , 2020, Neural Networks.
[13] Meng-Fan Chang,et al. 24.5 A Twin-8T SRAM Computation-In-Memory Macro for Multiple-Bit CNN-Based Machine Learning , 2019, 2019 IEEE International Solid- State Circuits Conference - (ISSCC).
[14] Yuangang Wang,et al. A Highly Parallel and Energy Efficient Three-Dimensional Multilayer CMOS-RRAM Accelerator for Tensorized Neural Network , 2018, IEEE Transactions on Nanotechnology.
[15] Marian Verhelst,et al. 14.5 Envision: A 0.26-to-10TOPS/W subword-parallel dynamic-voltage-accuracy-frequency-scalable Convolutional Neural Network processor in 28nm FDSOI , 2017, 2017 IEEE International Solid-State Circuits Conference (ISSCC).
[16] Andrew Zisserman,et al. Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.
[17] Miao Hu,et al. ISAAC: A Convolutional Neural Network Accelerator with In-Situ Analog Arithmetic in Crossbars , 2016, 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA).
[18] Xiaochen Peng,et al. XNOR-RRAM: A scalable and parallel resistive synaptic architecture for binary neural networks , 2018, 2018 Design, Automation & Test in Europe Conference & Exhibition (DATE).
[19] Tao Zhang,et al. PRIME: A Novel Processing-in-Memory Architecture for Neural Network Computation in ReRAM-Based Main Memory , 2016, 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA).
[20] Ran El-Yaniv,et al. Binarized Neural Networks , 2016, ArXiv.
[21] Tayfun Gokmen,et al. Training Deep Convolutional Neural Networks with Resistive Cross-Point Devices , 2017, Front. Neurosci..
[22] Heng-Yuan Lee,et al. A 4Mb embedded SLC resistive-RAM macro with 7.2ns read-write random-access time and 160ns MLC-access capability , 2011, 2011 IEEE International Solid-State Circuits Conference.
[23] Ali Farhadi,et al. XNOR-Net: ImageNet Classification Using Binary Convolutional Neural Networks , 2016, ECCV.
[24] Xiaochen Peng,et al. NeuroSim: A Circuit-Level Macro Model for Benchmarking Neuro-Inspired Architectures in Online Learning , 2018, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems.
[25] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[26] Ligang Gao,et al. Programming Protocol Optimization for Analog Weight Tuning in Resistive Memories , 2015, IEEE Electron Device Letters.
[27] Shimeng Yu,et al. Partition SRAM and RRAM based synaptic arrays for neuro-inspired computing , 2016, 2016 IEEE International Symposium on Circuits and Systems (ISCAS).
[28] Shuang Wu,et al. Training and Inference with Integers in Deep Neural Networks , 2018, ICLR.
[29] Yiran Chen,et al. PipeLayer: A Pipelined ReRAM-Based Accelerator for Deep Learning , 2017, 2017 IEEE International Symposium on High Performance Computer Architecture (HPCA).