Adaptive Tiling: Applying Fixed-size Systolic Arrays To Sparse Convolutional Neural Networks

We introduce adaptive tiling, a method of partitioning the layers of a sparse convolutional neural network (CNN) into blocks of filters and channels, called tiles, each implementable with a fixed-size systolic array. By allowing a tile to adapt its size so that it can cover a large sparse area, we minimize the total number of tiles, or equivalently, the number of systolic array calls required to perform CNN inference. The proposed scheme resolves a challenge of applying systolic array architectures, traditionally designed for dense matrices, to sparse CNNs. To validate the approach, we construct a highly sparse Lasso-Mobile network by pruning a MobileNet trained with an $\ell_{1}$ regularization penalty, and demonstrate that adaptive tiling yields a $2$-$3\times$ reduction in systolic array calls on Lasso-Mobile across several benchmark datasets.
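To make the call-count argument concrete, the following is a minimal toy sketch, not the paper's actual algorithm: it compares the number of fixed-size `T x T` systolic-array calls needed when every tile of a weight matrix is processed densely versus a simplified "adaptive" scheme that, within each band of `T` filter rows, packs only the channel columns containing at least one nonzero weight. The function names and the column-packing heuristic are illustrative assumptions.

```python
import numpy as np


def dense_tile_calls(W, T):
    """Calls needed if every T x T tile of W is sent to the array."""
    f, c = W.shape
    return int(np.ceil(f / T) * np.ceil(c / T))


def adaptive_tile_calls(W, T):
    """Toy adaptive scheme (illustrative, not the paper's method):
    for each band of T filter rows, pack only the channel columns
    that hold a nonzero weight, so one fixed-size tile can cover a
    wider sparse area of the layer."""
    f, c = W.shape
    calls = 0
    for r in range(0, f, T):
        band = W[r:r + T]
        # Columns in this band that contribute at least one nonzero.
        nz_cols = int(np.count_nonzero(band.any(axis=0)))
        if nz_cols:
            calls += int(np.ceil(nz_cols / T))
    return calls


# Example: an 8 x 16 layer with only two nonzero weights.
W = np.zeros((8, 16))
W[0, 0] = 1.0
W[4, 3] = 1.0
print(dense_tile_calls(W, T=4))     # 8 tiles if processed densely
print(adaptive_tile_calls(W, T=4))  # 2 tiles after column packing
```

Even this crude packing shows how sparsity translates into fewer fixed-size array invocations; the paper's adaptive tiling additionally adapts tile shapes across both filters and channels.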
