NoC-based DNN accelerator: a future design paradigm
Kun-Chih Chen | Masoumeh Ebrahimi | Ting-Yi Wang | Yuch-Chi Yang
[1] Geoffrey E. Hinton,et al. ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.
[2] Masoud Daneshtalab,et al. EbDa: A new theory on design and verification of deadlock-free interconnection networks , 2017, 2017 ACM/IEEE 44th Annual International Symposium on Computer Architecture (ISCA).
[3] Lawrence D. Jackel,et al. Handwritten Digit Recognition with a Back-Propagation Network , 1989, NIPS.
[4] Jim D. Garside,et al. SpiNNaker: A 1-W 18-Core System-on-Chip for Massively-Parallel Neural Network Simulation , 2013, IEEE Journal of Solid-State Circuits.
[5] Yann LeCun,et al. An FPGA-based stream processor for embedded real-time vision with Convolutional Networks , 2009, 2009 IEEE 12th International Conference on Computer Vision Workshops, ICCV Workshops.
[6] Aline Vieira de Mello. ATLAS - An Environment for NoC Generation and Evaluation , 2011.
[7] Nan Jiang,et al. A detailed and flexible cycle-accurate Network-on-Chip simulator , 2013, 2013 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS).
[8] Sven Behnke,et al. Evaluation of Pooling Operations in Convolutional Architectures for Object Recognition , 2010, ICANN.
[9] Liam McDaid,et al. Scalable Hierarchical Network-on-Chip Architecture for Spiking Neural Network Hardware Implementations , 2013, IEEE Transactions on Parallel and Distributed Systems.
[10] Vincent Vanhoucke,et al. Improving the speed of neural networks on CPUs , 2011.
[11] Christopher Ré,et al. Caffe con Troll: Shallow Ideas to Speed Up Deep Learning , 2015, DanaC@SIGMOD.
[12] Toru Baji,et al. Evolution of the GPU Device widely used in AI and Massive Parallel Processing , 2018, 2018 IEEE 2nd Electron Devices Technology and Manufacturing Conference (EDTM).
[13] Tianshi Chen,et al. ShiDianNao: Shifting vision processing closer to the sensor , 2015, 2015 ACM/IEEE 42nd Annual International Symposium on Computer Architecture (ISCA).
[14] Marian Verhelst,et al. A 0.3–2.6 TOPS/W precision-scalable processor for real-time large-scale ConvNets , 2016, 2016 IEEE Symposium on VLSI Circuits (VLSI-Circuits).
[15] Hyoukjun Kwon,et al. Rethinking NoCs for spatial neural network accelerators , 2017, 2017 Eleventh IEEE/ACM International Symposium on Networks-on-Chip (NOCS).
[16] Masoud Daneshtalab,et al. CuPAN - High Throughput On-chip Interconnection for Neural Networks , 2015, ICONIP.
[17] Ning Li,et al. A statistic approach for power analysis of integrated GPU , 2019, Soft Comput.
[18] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[19] Masoud Daneshtalab,et al. Reconfigurable communication fabric for efficient implementation of neural networks , 2015, 2015 10th International Symposium on Reconfigurable Communication-centric Systems-on-Chip (ReCoSoC).
[20] Hadi Esmaeilzadeh,et al. Bit Fusion: Bit-Level Dynamically Composable Architecture for Accelerating Deep Neural Network , 2017, 2018 ACM/IEEE 45th Annual International Symposium on Computer Architecture (ISCA).
[21] Ninghui Sun,et al. DianNao: a small-footprint high-throughput accelerator for ubiquitous machine-learning , 2014, ASPLOS.
[22] Qiong Wu,et al. Testing aware dynamic mapping for path-centric network-on-chip test , 2019, Integr.
[23] Leibo Liu,et al. A 1.06-to-5.09 TOPS/W reconfigurable hybrid-neural-network processor for deep learning applications , 2017, 2017 Symposium on VLSI Circuits.
[24] Tianshi Chen,et al. DaDianNao: A Neural Network Supercomputer , 2017, IEEE Transactions on Computers.
[25] Efraim Rotem,et al. Inside 6th-Generation Intel Core: New Microarchitecture Code-Named Skylake , 2017, IEEE Micro.
[26] Hoi-Jun Yoo,et al. UNPU: An Energy-Efficient Deep Neural Network Accelerator With Fully Variable Weight Bit Precision , 2019, IEEE Journal of Solid-State Circuits.
[27] Vivienne Sze,et al. Eyeriss v2: A Flexible Accelerator for Emerging Deep Neural Networks on Mobile Devices , 2018, IEEE Journal on Emerging and Selected Topics in Circuits and Systems.
[28] David A. Patterson,et al. In-datacenter performance analysis of a tensor processing unit , 2017, 2017 ACM/IEEE 44th Annual International Symposium on Computer Architecture (ISCA).
[29] Yiran Chen,et al. Neu-NoC: A high-efficient interconnection network for accelerated neuromorphic systems , 2018, 2018 23rd Asia and South Pacific Design Automation Conference (ASP-DAC).
[30] Vivienne Sze,et al. Eyeriss v2: A Flexible and High-Performance Accelerator for Emerging Deep Neural Networks , 2018, ArXiv.
[31] Song Han,et al. Deep Compression: Compressing Deep Neural Network with Pruning, Trained Quantization and Huffman Coding , 2015, ICLR.
[32] Masoud Daneshtalab,et al. Routing Algorithms in Networks-on-Chip , 2013.
[33] Luis Ceze,et al. Neural Acceleration for General-Purpose Approximate Programs , 2012, 2012 45th Annual IEEE/ACM International Symposium on Microarchitecture.
[34] Ce Liu,et al. Deep Convolutional Neural Network for Image Deconvolution , 2014, NIPS.
[35] Andrew Zisserman,et al. Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.
[36] Salvatore Monteleone,et al. Cycle-Accurate Network on Chip Simulation with Noxim , 2016, ACM Trans. Model. Comput. Simul.
[37] Kun-Chih Jimmy Chen,et al. NN-Noxim: High-Level Cycle-Accurate NoC-based Neural Networks Simulator , 2018, 2018 11th International Workshop on Network on Chip Architectures (NoCArc).
[38] Kun-Chih Jimmy Chen,et al. Cycle-Accurate NoC-based Convolutional Neural Network Simulator , 2019, COINS.
[39] Leibo Liu,et al. A High Energy Efficient Reconfigurable Hybrid Neural Network Processor for Deep Learning Applications , 2018, IEEE Journal of Solid-State Circuits.
[40] Pritish Narayanan,et al. Deep Learning with Limited Numerical Precision , 2015, ICML.
[41] Bo Chen,et al. MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications , 2017, ArXiv.
[42] Luca P. Carloni,et al. NoC-Based Support of Heterogeneous Cache-Coherence Models for Accelerators , 2018, 2018 Twelfth IEEE/ACM International Symposium on Networks-on-Chip (NOCS).
[43] Zhenyu Liu,et al. High-Performance FPGA-Based CNN Accelerator With Block-Floating-Point Arithmetic , 2019, IEEE Transactions on Very Large Scale Integration (VLSI) Systems.
[44] Yang Wang,et al. BigDL: A Distributed Deep Learning Framework for Big Data , 2018, SoCC.
[45] Liam McDaid,et al. Scalable Networks-on-Chip Interconnected Architecture for Astrocyte-Neuron Networks , 2016, IEEE Transactions on Circuits and Systems I: Regular Papers.
[46] Qiang Li,et al. A lifetime-aware mapping algorithm to extend MTTF of Networks-on-Chip , 2018, 2018 23rd Asia and South Pacific Design Automation Conference (ASP-DAC).
[47] Hannu Tenhunen,et al. HARAQ: Congestion-Aware Learning Model for Highly Adaptive Routing Algorithm in On-Chip Networks , 2012, 2012 IEEE/ACM Sixth International Symposium on Networks-on-Chip.
[48] Xuehai Zhou,et al. PuDianNao: A Polyvalent Machine Learning Accelerator , 2015, ASPLOS.