Recurrent Attentional Networks for Saliency Detection

Convolutional-deconvolution networks can be adopted for end-to-end saliency detection, but they do not work well with objects of multiple scales. To overcome this limitation, we propose a recurrent attentional convolutional-deconvolution network (RACDNN). Using spatial transformer and recurrent network units, RACDNN iteratively attends to selected image sub-regions and progressively refines their saliency. Besides tackling the scale problem, RACDNN also learns context-aware features from past iterations to enhance saliency refinement in later iterations. Experiments on several challenging saliency detection datasets validate the effectiveness of RACDNN and show that it outperforms state-of-the-art saliency detection methods.
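
As a rough illustration of the attend-and-refine loop described above, the following NumPy sketch crops a sub-region with a spatial-transformer-style affine grid, updates a toy recurrent state, and writes a refined patch back into the saliency map. It is a minimal sketch under stated assumptions, not the RACDNN implementation: the function names, the scalar hidden state, the sigmoid "refinement", and the hand-supplied attention boxes are all hypothetical stand-ins for the learned localization, recurrent, and convolutional-deconvolution components.

```python
import numpy as np

def affine_grid(theta, out_h, out_w):
    """Map an out_h x out_w grid of normalized coords in [-1, 1] through a 2x3 affine matrix."""
    ys, xs = np.meshgrid(np.linspace(-1, 1, out_h), np.linspace(-1, 1, out_w), indexing="ij")
    coords = np.stack([xs, ys, np.ones_like(xs)], axis=-1)  # (H, W, 3)
    return coords @ theta.T                                  # (H, W, 2) as (x, y)

def bilinear_sample(img, grid):
    """Bilinearly sample a 2D array at normalized (x, y) locations."""
    h, w = img.shape
    x = np.clip((grid[..., 0] + 1) * (w - 1) / 2, 0, w - 1)
    y = np.clip((grid[..., 1] + 1) * (h - 1) / 2, 0, h - 1)
    x0, y0 = np.floor(x).astype(int), np.floor(y).astype(int)
    x1, y1 = np.minimum(x0 + 1, w - 1), np.minimum(y0 + 1, h - 1)
    wx, wy = x - x0, y - y0
    top = img[y0, x0] * (1 - wx) + img[y0, x1] * wx
    bot = img[y1, x0] * (1 - wx) + img[y1, x1] * wx
    return top * (1 - wy) + bot * wy

def box_to_theta(box, h, w):
    """Affine parameters (scale + translation only) covering pixel box (y0, y1, x0, x1)."""
    y0, y1, x0, x1 = box
    nx0, nx1 = 2 * x0 / (w - 1) - 1, 2 * (x1 - 1) / (w - 1) - 1
    ny0, ny1 = 2 * y0 / (h - 1) - 1, 2 * (y1 - 1) / (h - 1) - 1
    return np.array([[(nx1 - nx0) / 2, 0.0, (nx0 + nx1) / 2],
                     [0.0, (ny1 - ny0) / 2, (ny0 + ny1) / 2]])

def recurrent_refine(saliency, boxes, glimpse=8):
    """Iteratively attend to each box, refine it, and paste the result back."""
    sal = saliency.copy()
    h, w = sal.shape
    identity = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
    hidden = 0.0  # assumption: a scalar stands in for the recurrent state
    for (y0, y1, x0, x1) in boxes:  # assumption: boxes are given, not predicted
        theta = box_to_theta((y0, y1, x0, x1), h, w)
        patch = bilinear_sample(sal, affine_grid(theta, glimpse, glimpse))
        hidden = np.tanh(0.5 * hidden + patch.mean())  # toy recurrent update carrying context
        refined = 1.0 / (1.0 + np.exp(-8.0 * (patch - 0.5) - hidden))  # placeholder refinement
        # Resample the fixed-size refined patch back to the box's pixel size.
        sal[y0:y1, x0:x1] = bilinear_sample(refined, affine_grid(identity, y1 - y0, x1 - x0))
    return sal
```

Because every box is resampled to the same glimpse size before refinement, a large object and a small object are processed at a comparable resolution, which is the intuition behind using attention to address the scale problem. A call such as `recurrent_refine(initial_map, [(0, 16, 0, 16), (4, 12, 4, 12)])` would refine the whole map first and then a central sub-region, with the hidden value passing information from the first step to the second.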
