Reliable and Energy Efficient MLC STT-RAM Buffer for CNN Accelerators
Masoomeh Jasemi | Shaahin Hessabi | Nader Bagherzadeh
[1] Kaushik Roy, et al. Approximate storage for energy efficient spintronic memories, 2015, 2015 52nd ACM/EDAC/IEEE Design Automation Conference (DAC).
[2] Kaushik Roy, et al. Write-optimized reliable design of STT MRAM, 2012, ISLPED '12.
[3] Yann LeCun, et al. What is the best multi-stage architecture for object recognition?, 2009, 2009 IEEE 12th International Conference on Computer Vision.
[4] Ning Ge, et al. TriZone: A Design of MLC STT-RAM Cache for Combined Performance, Energy, and Reliability Optimizations, 2018, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems.
[5] Cong Xu, et al. NVSim: A Circuit-Level Performance, Energy, and Area Model for Emerging Nonvolatile Memory, 2012, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems.
[6] Yang Hu, et al. Towards Pervasive and User Satisfactory CNN across GPU Microarchitectures, 2017, 2017 IEEE International Symposium on High Performance Computer Architecture (HPCA).
[7] Scott A. Mahlke, et al. In-Memory Data Parallel Processor, 2018, ASPLOS.
[8] Natalie D. Enright Jerger, et al. Reduced-Precision Strategies for Bounded Memory in Deep Neural Nets, 2015, ArXiv.
[9] Matthew Mattina, et al. SCALE-Sim: Systolic CNN Accelerator, 2018, ArXiv.
[10] Jason Cong, et al. Designing scratchpad memory architecture with emerging STT-RAM memory technologies, 2013, 2013 IEEE International Symposium on Circuits and Systems (ISCAS2013).
[11] Tim Salimans, et al. Weight Normalization: A Simple Reparameterization to Accelerate Training of Deep Neural Networks, 2016, NIPS.
[12] Yiran Chen, et al. Multi-level cell STT-RAM: Is it realistic or just a dream?, 2012, 2012 IEEE/ACM International Conference on Computer-Aided Design (ICCAD).
[13] Masoomeh Jasemi, et al. Enhancing Reliability of Emerging Memory Technology for Machine Learning Accelerators, 2020.
[14] Babak Hassibi, et al. Second Order Derivatives for Network Pruning: Optimal Brain Surgeon, 1992, NIPS.
[15] Jacques-Olivier Klein, et al. Spin-Transfer Torque Magnetic Memory as a Stochastic Memristive Synapse for Neuromorphic Systems, 2015, IEEE Transactions on Biomedical Circuits and Systems.
[16] Liu Liu, et al. Building energy-efficient multi-level cell STT-RAM caches with data compression, 2017, 2017 22nd Asia and South Pacific Design Automation Conference (ASP-DAC).
[17] Yanzhi Wang, et al. A Systematic DNN Weight Pruning Framework using Alternating Direction Method of Multipliers, 2018, ECCV.
[18] Masoomeh Jasemi, et al. NoC Design Methodologies for Heterogeneous Architecture, 2020, 2020 28th Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP).
[19] Yiran Chen, et al. PipeLayer: A Pipelined ReRAM-Based Accelerator for Deep Learning, 2017, 2017 IEEE International Symposium on High Performance Computer Architecture (HPCA).
[20] Olivier Temam, et al. Leveraging the error resilience of machine-learning applications for designing highly energy efficient accelerators, 2014, 2014 19th Asia and South Pacific Design Automation Conference (ASP-DAC).
[21] Hao Yan, et al. CELIA: A Device and Architecture Co-Design Framework for STT-MRAM-Based Deep Learning Acceleration, 2018, ICS.
[22] Gang Quan, et al. A statistical STT-RAM retention model for fast memory subsystem designs, 2017, 2017 22nd Asia and South Pacific Design Automation Conference (ASP-DAC).
[23] Pradeep Dubey, et al. SCALEDEEP: A scalable compute architecture for learning and evaluating deep networks, 2017, 2017 ACM/IEEE 44th Annual International Symposium on Computer Architecture (ISCA).
[24] Jie Xu, et al. DeepBurning: Automatic generation of FPGA-based learning accelerators for the Neural Network family, 2016, 2016 53rd ACM/EDAC/IEEE Design Automation Conference (DAC).
[25] Ali Farhadi, et al. You Only Look Once: Unified, Real-Time Object Detection, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[26] Masoomeh Jasemi, et al. A Radiation Hard Sense Circuit for Spin Transfer Torque Random Access Memory, 2019, 2019 IEEE International Symposium on Circuits and Systems (ISCAS).
[27] Gu-Yeon Wei, et al. MaxNVM: Maximizing DNN Storage Density and Inference Efficiency with Sparse Encoding and Error Mitigation, 2019, MICRO.
[28] Georg Heigold, et al. Small-footprint keyword spotting using deep neural networks, 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[29] Yiran Chen, et al. State-restrict MLC STT-RAM designs for high-reliable high-performance memory system, 2014, 2014 51st ACM/EDAC/IEEE Design Automation Conference (DAC).
[30] Yoshua Bengio, et al. Gradient-based learning applied to document recognition, 1998, Proc. IEEE.
[31] Ninghui Sun, et al. DianNao: a small-footprint high-throughput accelerator for ubiquitous machine-learning, 2014, ASPLOS.
[32] Song Han, et al. Deep Compression: Compressing Deep Neural Network with Pruning, Trained Quantization and Huffman Coding, 2015, ICLR.
[33] Deming Chen, et al. Debugging and verifying SoC designs through effective cross-layer hardware-software co-simulation, 2016, DAC.
[34] H. Noguchi, et al. Progress of STT-MRAM technology and the effect on normally-off computing systems, 2012, 2012 International Electron Devices Meeting.
[35] Chita R. Das, et al. Cache revive: Architecting volatile STT-RAM caches for enhanced performance in CMPs, 2012, DAC Design Automation Conference 2012.
[36] Yuan Yu, et al. TensorFlow: A system for large-scale machine learning, 2016, OSDI.
[37] Masoomeh Jasemi, et al. Partition Pruning: Parallelization-Aware Pruning for Dense Neural Networks, 2020, 2020 28th Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP).
[38] Kaushik Roy, et al. Future cache design using STT MRAMs for improved energy efficiency: Devices, circuits and architecture, 2012, DAC Design Automation Conference 2012.
[39] Ying Wang, et al. STT-RAM Buffer Design for Precision-Tunable General-Purpose Neural Network Accelerator, 2017, IEEE Transactions on Very Large Scale Integration (VLSI) Systems.
[40] Liang Shi, et al. Two-step state transition minimization for lifetime and performance improvement on MLC STT-RAM, 2016, 2016 53rd ACM/EDAC/IEEE Design Automation Conference (DAC).
[41] H.-S. Philip Wong, et al. On-Chip Memory Technology Design Space Explorations for Mobile Deep Neural Network Accelerators, 2019, 2019 56th ACM/IEEE Design Automation Conference (DAC).
[42] A. Krizhevsky. Convolutional Deep Belief Networks on CIFAR-10, 2010.
[43] Yu Wang, et al. Going Deeper with Embedded FPGA Platform for Convolutional Neural Network, 2016, FPGA.