AttentionDrop for Convolutional Neural Networks

Dropout is widely used in fully connected networks but is less effective in convolutional neural networks (CNNs), because spatially correlated features allow information from dropped units to still flow through the network. To make dropout more effective for CNNs, structured dropout methods have recently been proposed that drop regions of fixed shape at random positions, which may nonetheless discard useful information indiscriminately. To address this problem, we propose AttentionDrop, a novel dropout variant that uses attention information to drop features adaptively. Specifically, it localizes irregularly shaped masks according to the values of the activation units, and the use of soft values in the adaptive masks reduces the risk of a complete loss of indispensable information. Experimental results on public image-classification datasets demonstrate the effectiveness of AttentionDrop.
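To make the idea concrete, the sketch below shows one plausible form of attention-guided soft dropout in PyTorch. The specific choices here are assumptions introduced for illustration and are not taken from the paper: the attention map is taken as the channel-wise mean of the activations, the most activated positions are selected with a quantile threshold, and the parameter names `drop_quantile` and `soft_keep` are hypothetical.

```python
import torch
import torch.nn as nn


class AttentionDrop(nn.Module):
    """Illustrative sketch of attention-guided soft dropout.

    Assumptions made for this example (not taken from the paper):
    the attention map is the channel-wise mean of the activations,
    the most activated spatial positions are selected by a quantile
    threshold, and those positions are attenuated by a soft factor
    instead of being zeroed out.
    """

    def __init__(self, drop_quantile: float = 0.9, soft_keep: float = 0.25):
        super().__init__()
        self.drop_quantile = drop_quantile  # fraction of positions left untouched
        self.soft_keep = soft_keep          # soft scaling applied to "dropped" positions

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        if not self.training:
            return x
        # Attention map: channel-wise mean activation, shape (N, 1, H, W).
        attn = x.mean(dim=1, keepdim=True)
        # Per-sample threshold at the chosen quantile of the attention map.
        thresh = torch.quantile(attn.flatten(1), self.drop_quantile, dim=1)
        thresh = thresh.view(-1, 1, 1, 1)
        # Soft, irregularly shaped mask: highly activated positions are
        # scaled by soft_keep rather than set to zero, so indispensable
        # information is attenuated instead of fully discarded.
        mask = torch.where(attn > thresh,
                           torch.full_like(attn, self.soft_keep),
                           torch.ones_like(attn))
        # Renormalize per sample so the expected activation scale is preserved.
        norm = mask.flatten(1).mean(dim=1).view(-1, 1, 1, 1)
        return x * mask / norm.clamp(min=1e-6)


# Usage (illustrative): apply after a convolutional block during training.
layer = AttentionDrop(drop_quantile=0.9, soft_keep=0.25)
layer.train()
out = layer(torch.randn(8, 64, 32, 32))  # output has the same shape as the input
```

Because the mask is derived from the activations themselves, its shape is irregular and input-dependent rather than a fixed block, which is the behavior the abstract contrasts with structured dropout methods.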
