Mapping and virtual neuron assignment algorithms for MAERI accelerator
[1] Kun-Chih Jimmy Chen, et al. Cycle-Accurate NoC-based Convolutional Neural Network Simulator, 2019, COINS.
[2] Kun-Chih Chen, et al. NoC-based DNN accelerator: a future design paradigm, 2019, NOCS.
[3] Vivek Sarkar, et al. Understanding Reuse, Performance, and Hardware Cost of DNN Dataflow: A Data-Centric Approach, 2019, MICRO.
[4] Zhigang Mao, et al. mRNA: Enabling Efficient Mapping Space Exploration for a Reconfigurable Neural Accelerator, 2019, 2019 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS).
[5] Salvatore Monteleone, et al. DNNZip: Selective Layers Compression Technique in Deep Neural Network Accelerators, 2020, 2020 23rd Euromicro Conference on Digital System Design (DSD).
[6] Karthikeyan Sankaralingam, et al. A general constraint-centric scheduling framework for spatial architectures, 2013, PLDI.
[7] Christoforos E. Kozyrakis, et al. TETRIS: Scalable and Efficient Neural Network Acceleration with 3D Memory, 2017, ASPLOS.
[8] David A. Patterson, et al. In-datacenter performance analysis of a tensor processing unit, 2017, 2017 ACM/IEEE 44th Annual International Symposium on Computer Architecture (ISCA).
[9] Matthew Mattina, et al. SCALE-Sim: Systolic CNN Accelerator, 2018, ArXiv.
[10] Dipankar Das, et al. SIGMA: A Sparse and Irregular GEMM Accelerator with Flexible Interconnects for DNN Training, 2020, 2020 IEEE International Symposium on High Performance Computer Architecture (HPCA).
[11] John Jose, et al. Improving Inference Latency and Energy of Network-on-Chip based Convolutional Neural Networks through Weights Compression, 2020, 2020 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW).
[12] Salvatore Monteleone, et al. Cycle-Accurate Network on Chip Simulation with Noxim, 2016, ACM Trans. Model. Comput. Simul.
[13] Hyoukjun Kwon, et al. MAERI: Enabling Flexible Dataflow Mapping over DNN Accelerators via Reconfigurable Interconnects, 2018, ASPLOS.
[14] Karthikeyan Sankaralingam, et al. Stream-dataflow acceleration, 2017, 2017 ACM/IEEE 44th Annual International Symposium on Computer Architecture (ISCA).
[15] Xiaowei Li, et al. FlexFlow: A Flexible Dataflow Accelerator Architecture for Convolutional Neural Networks, 2017, 2017 IEEE International Symposium on High Performance Computer Architecture (HPCA).
[16] Xuehai Zhou, et al. PuDianNao: A Polyvalent Machine Learning Accelerator, 2015, ASPLOS.
[17] Tianshi Chen, et al. ShiDianNao: Shifting vision processing closer to the sensor, 2015, 2015 ACM/IEEE 42nd Annual International Symposium on Computer Architecture (ISCA).
[18] Qi Yu, et al. DLAU: A Scalable Deep Learning Accelerator Unit on FPGA, 2016, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems.
[19] Midia Reshadi, et al. Flow mapping and data distribution on mesh-based deep learning accelerator, 2019, NOCS.
[20] Kun-Chih Chen, et al. A NoC-based simulator for design and evaluation of deep neural networks, 2020, Microprocess. Microsystems.
[21] Midia Reshadi, et al. Flow mapping on mesh-based deep learning accelerator, 2020, J. Parallel Distributed Comput.
[22] Masoud Daneshtalab, et al. Reconfigurable Network-on-Chip for 3D Neural Network Accelerators, 2018, 2018 Twelfth IEEE/ACM International Symposium on Networks-on-Chip (NOCS).
[23] Ninghui Sun, et al. DianNao: a small-footprint high-throughput accelerator for ubiquitous machine-learning, 2014, ASPLOS.
[24] Hyoukjun Kwon, et al. MAESTRO: An Open-source Infrastructure for Modeling Dataflows within Deep Learning Accelerators, 2018, ArXiv.
[25] Vivienne Sze, et al. Efficient Processing of Deep Neural Networks: A Tutorial and Survey, 2017, Proceedings of the IEEE.
[26] Hyoukjun Kwon, et al. Rethinking NoCs for spatial neural network accelerators, 2017, 2017 Eleventh IEEE/ACM International Symposium on Networks-on-Chip (NOCS).
[27] Joel Emer, et al. Eyeriss: An Energy-Efficient Reconfigurable Accelerator for Deep Convolutional Neural Networks, 2017, IEEE Journal of Solid-State Circuits.
[28] Martín Abadi, et al. TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems, 2016, ArXiv.
[29] Luca Benini, et al. Hyperdrive: A Systolically Scalable Binary-Weight CNN Inference Engine for mW IoT End-Nodes, 2018, 2018 IEEE Computer Society Annual Symposium on VLSI (ISVLSI).
[30] Jia Wang, et al. DaDianNao: A Machine-Learning Supercomputer, 2014, 2014 47th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).