Progressive Attention Guided Recurrent Network for Salient Object Detection

Effective convolutional features play an important role in saliency estimation but how to learn powerful features for saliency is still a challenging task. FCN-based methods directly apply multi-level convolutional features without distinction, which leads to sub-optimal results due to the distraction from redundant details. In this paper, we propose a novel attention guided network which selectively integrates multi-level contextual information in a progressive manner. Attentive features generated by our network can alleviate distraction of background thus achieve better performance. On the other hand, it is observed that most of existing algorithms conduct salient object detection by exploiting side-output features of the backbone feature extraction network. However, shallower layers of backbone network lack the ability to obtain global semantic information, which limits the effective feature learning. To address the problem, we introduce multi-path recurrent feedback to enhance our proposed progressive attention driven framework. Through multi-path recurrent connections, global semantic information from the top convolutional layer is transferred to shallower layers, which intrinsically refines the entire network. Experimental results on six benchmark datasets demonstrate that our algorithm performs favorably against the state-of-the-art approaches.

[1]  Shi-Min Hu,et al.  Global contrast based salient region detection , 2011, CVPR 2011.

[2]  Yael Pritch,et al.  Saliency filters: Contrast based filtering for salient region detection , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[3]  Huchuan Lu,et al.  Saliency Detection via Dense and Sparse Reconstruction , 2013, 2013 IEEE International Conference on Computer Vision.

[4]  Huchuan Lu,et al.  Saliency Detection via Graph-Based Manifold Ranking , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[5]  Li Xu,et al.  Hierarchical Saliency Detection , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[6]  Shi-Min Hu,et al.  SalientShape: group saliency in image collections , 2013, The Visual Computer.

[7]  Jingdong Wang,et al.  Salient Object Detection: A Discriminative Regional Feature Integration Approach , 2013, International Journal of Computer Vision.

[8]  James M. Rehg,et al.  The Secrets of Salient Object Segmentation , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[9]  Trevor Darrell,et al.  Caffe: Convolutional Architecture for Fast Feature Embedding , 2014, ACM Multimedia.

[10]  Jian Sun,et al.  Saliency Optimization from Robust Background Detection , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[11]  Huchuan Lu,et al.  Saliency detection via Cellular Automata , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[12]  Yizhou Yu,et al.  Visual saliency based on multiscale deep features , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[13]  Huchuan Lu,et al.  Deep networks for saliency detection via local estimation and global search , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[14]  Yoshua Bengio,et al.  Show, Attend and Tell: Neural Image Caption Generation with Visual Attention , 2015, ICML.

[15]  Huchuan Lu,et al.  Salient object detection via bootstrap learning , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[16]  Ali Borji,et al.  Salient Object Detection: A Benchmark , 2015, IEEE Transactions on Image Processing.

[17]  Xiaogang Wang,et al.  Saliency detection by multi-context deep learning , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[18]  Kate Saenko,et al.  Ask, Attend and Answer: Exploring Question-Guided Spatial Attention for Visual Question Answering , 2015, ECCV.

[19]  Alexander J. Smola,et al.  Stacked Attention Networks for Image Question Answering , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[20]  Yizhou Yu,et al.  Deep Contrast Learning for Salient Object Detection , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[21]  Yueting Zhuang,et al.  DeepSaliency: Multi-Task Deep Neural Network Model for Salient Object Detection , 2015, IEEE Transactions on Image Processing.

[22]  Huchuan Lu,et al.  Kernelized Subspace Ranking for Saliency Detection , 2016, ECCV.

[23]  Huchuan Lu,et al.  Saliency Detection with Recurrent Fully Convolutional Networks , 2016, ECCV.

[24]  Gang Wang,et al.  Recurrent Attentional Networks for Saliency Detection , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[25]  Joachim Denzler,et al.  ImageNet pre-trained models with batch normalization , 2016, ArXiv.

[26]  Gayoung Lee,et al.  Deep Saliency with Encoded Low Level Distance Map and High Level Features , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[27]  Junwei Han,et al.  DHSNet: Deep Hierarchical Saliency Network for Salient Object Detection , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[28]  Trevor Darrell,et al.  Fully Convolutional Networks for Semantic Segmentation , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[29]  Huchuan Lu,et al.  Learning Uncertain Convolutional Features for Accurate Saliency Detection , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[30]  Tao Li,et al.  Structure-Measure: A New Way to Evaluate Foreground Maps , 2017, International Journal of Computer Vision.

[31]  Zhuowen Tu,et al.  Deeply Supervised Salient Object Detection with Short Connections , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[32]  Xiaogang Wang,et al.  Residual Attention Network for Image Classification , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[33]  Huchuan Lu,et al.  Amulet: Aggregating Multi-level Convolutional Features for Salient Object Detection , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[34]  Tat-Seng Chua,et al.  SCA-CNN: Spatial and Channel-Wise Attention in Convolutional Networks for Image Captioning , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[35]  Huchuan Lu,et al.  A Stagewise Refinement Model for Detecting Salient Objects in Images , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[36]  Xiaogang Wang,et al.  Multi-context Attention for Human Pose Estimation , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[37]  Huchuan Lu,et al.  Learning to Detect Salient Objects with Image-Level Supervision , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[38]  Shuicheng Yan,et al.  Multi-Path Feedback Recurrent Neural Networks for Scene Parsing , 2016, AAAI.

[39]  Ming-Hsuan Yang,et al.  DeepSaliency: Multi-Task Deep Neural Network Model for Salient Object Detection , 2015, IEEE Transactions on Image Processing.