Paper information - Information Entropy Based Feature Pooling for Convolutional Neural Networks

In convolutional neural networks (CNNs), we propose to estimate the importance of the feature vector at each spatial location of the feature maps by the network's uncertainty in its class prediction, which can be quantified using information entropy. Based on this idea, we propose an entropy-based feature weighting method for semantics-aware feature pooling that can be readily integrated into various CNN architectures for both training and inference. We demonstrate that this location-adaptive feature weighting mechanism helps the network concentrate on semantically important image regions, leading to improvements in large-scale classification and weakly-supervised semantic segmentation. Furthermore, the generated feature weights can be used in visual tasks such as weakly-supervised object localization. We conduct extensive experiments on different datasets and CNN architectures, outperforming recently proposed pooling methods and attention mechanisms on ImageNet classification and achieving state-of-the-art results in weakly-supervised semantic segmentation on the PASCAL VOC 2012 dataset.
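The idea of weighting each location by prediction entropy can be illustrated with a minimal sketch. The abstract does not give the exact weighting formula, so several things here are assumptions made for illustration only: the per-location classifier is a hypothetical linear layer (`class_weights`), the weight is taken as one minus the normalized entropy (so confident locations count more), and the weights are normalized to sum to one before a weighted average. None of this should be read as the authors' implementation.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of class scores."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(probs):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def entropy_weighted_pool(features, class_weights):
    """Pool N location vectors into one vector using entropy-derived weights.

    features:      list of N feature vectors, each of length D
                   (the H x W grid flattened into N locations).
    class_weights: C rows of length D; a hypothetical linear classifier
                   applied independently at every location (assumption).
    """
    max_entropy = math.log(len(class_weights))
    raw = []
    for f in features:
        logits = [sum(w * x for w, x in zip(row, f)) for row in class_weights]
        h = entropy(softmax(logits))
        # Assumed weighting: low entropy (confident prediction) -> large weight.
        raw.append(max(0.0, 1.0 - h / max_entropy))
    total = sum(raw)
    if total < 1e-12:
        # Every location is maximally uncertain: fall back to average pooling.
        weights = [1.0 / len(features)] * len(features)
    else:
        weights = [r / total for r in raw]
    dim = len(features[0])
    pooled = [sum(w * f[d] for w, f in zip(weights, features)) for d in range(dim)]
    return pooled, weights

# Three locations, two classes: only the first location yields a confident prediction.
features = [[4.0, 0.0], [0.1, 0.1], [0.0, 0.0]]
class_weights = [[1.0, 0.0], [0.0, 1.0]]
pooled, weights = entropy_weighted_pool(features, class_weights)
print(weights)
print(pooled)
```

In this toy input the second and third locations produce uniform class probabilities, so they receive essentially zero weight and the pooled vector is dominated by the first location. Plain global average pooling would instead give all three locations equal weight.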
