Adaptive Tiling: Applying Fixed-size Systolic Arrays To Sparse Convolutional Neural Networks

We introduce adaptive tiling, a method of partitioning the layers of a sparse convolutional neural network (CNN) into blocks of filters and channels, called tiles, each implementable with a fixed-size systolic array. By allowing a tile to adapt its size so that it can cover a large sparse area, we minimize the total number of tiles, or equivalently, the number of systolic array calls required to perform CNN inference. The proposed scheme resolves a challenge of applying systolic array architectures, traditionally designed for dense matrices, to sparse CNNs. To validate the approach, we construct a highly sparse Lasso-Mobile network by pruning a MobileNet trained with an $\ell_{1}$ regularization penalty, and demonstrate that adaptive tiling yields a $2$-$3\times$ reduction in systolic array calls on Lasso-Mobile across several benchmark datasets.
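To make the call-count argument concrete, the following is a minimal toy sketch, not the paper's actual algorithm: it compares the number of fixed-size `T x T` systolic-array calls needed when every tile of a weight matrix is processed densely versus a simplified "adaptive" scheme that, within each band of `T` filter rows, packs only the channel columns containing at least one nonzero weight. The function names and the column-packing heuristic are illustrative assumptions.

```python
import numpy as np


def dense_tile_calls(W, T):
    """Calls needed if every T x T tile of W is sent to the array."""
    f, c = W.shape
    return int(np.ceil(f / T) * np.ceil(c / T))


def adaptive_tile_calls(W, T):
    """Toy adaptive scheme (illustrative, not the paper's method):
    for each band of T filter rows, pack only the channel columns
    that hold a nonzero weight, so one fixed-size tile can cover a
    wider sparse area of the layer."""
    f, c = W.shape
    calls = 0
    for r in range(0, f, T):
        band = W[r:r + T]
        # Columns in this band that contribute at least one nonzero.
        nz_cols = int(np.count_nonzero(band.any(axis=0)))
        if nz_cols:
            calls += int(np.ceil(nz_cols / T))
    return calls


# Example: an 8 x 16 layer with only two nonzero weights.
W = np.zeros((8, 16))
W[0, 0] = 1.0
W[4, 3] = 1.0
print(dense_tile_calls(W, T=4))     # 8 tiles if processed densely
print(adaptive_tile_calls(W, T=4))  # 2 tiles after column packing
```

Even this crude packing shows how sparsity translates into fewer fixed-size array invocations; the paper's adaptive tiling additionally adapts tile shapes across both filters and channels.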
