Fast algorithm using summed area tables with unified layer performing convolution and average pooling

Convolutional neural networks (CNNs), in which several convolutional layers extract feature patterns from an input image, are among the most popular network architectures for image classification. Convolution, however, is computationally expensive, which increases both power consumption and processing time. In this paper, we propose a novel algorithm that replaces a convolutional layer and the average-pooling layer that follows it with a single unified layer. The key idea is to compute the output of the original pair of layers without performing the convolution explicitly. To this end, our algorithm first generates summed area tables (SATs) of the input images and then computes the output values directly from the SATs. We implemented the algorithm for both forward and backward propagation to evaluate its performance. In our experiments, the algorithm ran 17.1 times faster than the original algorithm with the same parameters as those used in ResNet-34.
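The idea can be illustrated with a minimal single-channel sketch. It is not the implementation evaluated in the paper, and it rests on several assumptions: a stride-1 "valid" convolution (computed as cross-correlation, as is common in CNN frameworks), non-overlapping p-by-p average pooling, and no bias or activation between the two layers. Under those assumptions the pair of layers is linear, so each pooled output equals a weighted sum of p-by-p box sums of the input, one per kernel weight, and each box sum takes four SAT lookups. All function names below are hypothetical.

```python
def summed_area_table(img):
    """Build a SAT padded with a zero row and column:
    sat[y][x] holds the sum of img over rows < y and columns < x."""
    h, w = len(img), len(img[0])
    sat = [[0.0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0.0
        for x in range(w):
            row_sum += img[y][x]
            sat[y + 1][x + 1] = sat[y][x + 1] + row_sum
    return sat


def box_sum(sat, y0, x0, h, w):
    """Sum of the h-by-w window with top-left pixel (y0, x0), using four lookups."""
    return sat[y0 + h][x0 + w] - sat[y0][x0 + w] - sat[y0 + h][x0] + sat[y0][x0]


def conv_then_avgpool(img, kernel, p):
    """Reference path: valid stride-1 cross-correlation, then p-by-p average
    pooling with stride p (assumed layer settings, see the lead-in)."""
    kh, kw = len(kernel), len(kernel[0])
    ch, cw = len(img) - kh + 1, len(img[0]) - kw + 1
    conv = [[sum(kernel[i][j] * img[y + i][x + j]
                 for i in range(kh) for j in range(kw))
             for x in range(cw)] for y in range(ch)]
    return [[sum(conv[y * p + a][x * p + b]
                 for a in range(p) for b in range(p)) / (p * p)
             for x in range(cw // p)] for y in range(ch // p)]


def unified_sat_layer(img, kernel, p):
    """Unified path: the same output computed from the SAT, with one box sum
    per kernel weight and no intermediate convolution output."""
    kh, kw = len(kernel), len(kernel[0])
    ch, cw = len(img) - kh + 1, len(img[0]) - kw + 1
    sat = summed_area_table(img)
    return [[sum(kernel[i][j] * box_sum(sat, y * p + i, x * p + j, p, p)
                 for i in range(kh) for j in range(kw)) / (p * p)
             for x in range(cw // p)] for y in range(ch // p)]
```

In this sketch each pooled output costs one multiplication per kernel weight, whereas the reference path spends one per kernel weight for each of the p × p convolution outputs in the pooling window. The sketch makes no claim about the speedup reported above, which depends on the authors' implementation and parameters.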
