Exploration of task-based scheduling for convolutional neural networks accelerators under memory constraints
Crefeda Faviola Rodrigues | Mikel Luján | Graham Riley
[1] Xuan Yang,et al. A Systematic Approach to Blocking Convolutional Neural Networks , 2016, ArXiv.
[2] Manoj Alwani,et al. Fused-layer CNN accelerators , 2016, 2016 49th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[3] Christoforos E. Kozyrakis,et al. TETRIS: Scalable and Efficient Neural Network Acceleration with 3D Memory , 2017, ASPLOS.
[4] Yves Robert,et al. Memory-Aware List Scheduling for Hybrid Platforms , 2014, 2014 IEEE International Parallel & Distributed Processing Symposium Workshops.
[5] Salim Hariri,et al. Performance-Effective and Low-Complexity Task Scheduling for Heterogeneous Computing , 2002, IEEE Trans. Parallel Distributed Syst.
[6] Geoffrey E. Hinton,et al. ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.
[7] Yoshua Bengio,et al. Gradient-based learning applied to document recognition , 1998, Proc. IEEE.
[8] Andrew Zisserman,et al. Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.
[9] Joel Emer,et al. Eyeriss: a spatial architecture for energy-efficient dataflow for convolutional neural networks , 2016, CARN.
[10] Simon Haykin,et al. Gradient-Based Learning Applied to Document Recognition , 2001.
[11] Nicholas D. Lane,et al. An Early Resource Characterization of Deep Learning on Wearables, Smartphones and Internet-of-Things Devices , 2015, IoT-App@SenSys.
[12] Matthew Mattina,et al. SCALE-Sim: Systolic CNN Accelerator , 2018, ArXiv.
[13] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[14] Luca Benini,et al. Optimal Tiling Strategy for Memory Bandwidth Reduction for CNNs , 2017, ACIVS.
[15] David A. Patterson,et al. In-datacenter performance analysis of a tensor processing unit , 2017, 2017 ACM/IEEE 44th Annual International Symposium on Computer Architecture (ISCA).
[16] Bo Chen,et al. MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications , 2017, ArXiv.
[17] Luca Benini,et al. Optimally Scheduling CNN Convolutions for Efficient Memory Access , 2019, ArXiv.