MASA-Net: Multi-level Aggregated and Self-attentive Network for Fast Salient Object Detection

With the rapid development of convolutional neural networks, salient object detection has achieved satisfactory visual performance. However, most previous methods focus on either accuracy or speed. We propose a multi-level aggregated and self-attentive network (MASA-Net) for fast salient object detection that improves and balances both metrics. We address the scale variation problem with multi-level aggregated modules, which aggregate neighboring features and enhance the features at different resolutions. To enhance the representative features in each layer, we propose a self-attentive module with atrous convolution in the decoder stage. To maintain object consistency and boundary clarity, we propose a multi-level fusion loss that combines the first layer's boundary loss with a five-layer binary cross-entropy loss. Experimental results on five widely used datasets demonstrate that the proposed method performs favorably against state-of-the-art methods in both accuracy and speed, without any pre-processing or post-processing.
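The multi-level fusion loss described above can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not define the boundary loss or the term weights, so the sketch assumes that the boundary term is a binary cross-entropy against a boundary target derived from the ground-truth mask, that all predictions have already been resized to the mask resolution, and that the terms are summed with a single hypothetical weight `boundary_weight`. The helper names `bce`, `boundary_map`, and `multi_level_fusion_loss` are illustrative only.

```python
import numpy as np


def bce(pred, target, eps=1e-7):
    """Mean binary cross-entropy between predicted probabilities and a binary target."""
    pred = np.clip(pred, eps, 1.0 - eps)  # avoid log(0)
    return float(-np.mean(target * np.log(pred) + (1.0 - target) * np.log(1.0 - pred)))


def boundary_map(mask):
    """Assumed boundary target: foreground pixels with at least one background 4-neighbour."""
    padded = np.pad(mask, 1, mode="edge")
    neighbours_min = np.minimum.reduce([
        padded[:-2, 1:-1],  # pixel above
        padded[2:, 1:-1],   # pixel below
        padded[1:-1, :-2],  # pixel to the left
        padded[1:-1, 2:],   # pixel to the right
    ])
    return mask * (1.0 - neighbours_min)


def multi_level_fusion_loss(saliency_preds, boundary_pred, mask, boundary_weight=1.0):
    """Sum of per-layer BCE terms plus a weighted boundary BCE for the first layer.

    saliency_preds: list of per-layer saliency maps (five in the paper's setting),
                    each with the same shape as `mask` (an assumption of this sketch).
    boundary_pred:  first-layer boundary prediction, same shape as `mask`.
    boundary_weight: hypothetical weighting factor, not taken from the paper.
    """
    region_loss = sum(bce(p, mask) for p in saliency_preds)
    edge_loss = bce(boundary_pred, boundary_map(mask))
    return region_loss + boundary_weight * edge_loss
```

Calling `multi_level_fusion_loss` with five saliency maps, one boundary map, and a binary ground-truth mask returns a single scalar. The per-layer terms supervise region consistency at every decoder level, while the boundary term penalizes errors only along object edges.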
