DAISM: Digital Approximate In-SRAM Multiplier-based Accelerator for DNN Training and Inference