Paper Information - Deep Saliency with Encoded Low Level Distance Map and High Level Features

Recent advances in saliency detection have utilized deep learning to obtain high level features to detect salient regions in a scene. These advances have demonstrated superior results over previous works that utilize hand-crafted low level features for saliency detection. In this paper, we demonstrate that hand-crafted features can provide complementary information to enhance performance of saliency detection that utilizes only high level features. Our method utilizes both high level and low level features for saliency detection under a unified deep learning framework. The high level features are extracted using the VGG-net, and the low level features are compared with other parts of an image to form a low level distance map. The low level distance map is then encoded using a convolutional neural network (CNN) with multiple 1×1 convolutional and ReLU layers. We concatenate the encoded low level distance map and the high level features, and connect them to a fully connected neural network classifier to evaluate the saliency of a query region. Our experiments show that our method can further improve the performance of state-of-the-art deep learning-based saliency detection methods.
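
The data flow described in the abstract (encode the distance map with 1×1 convolutions and ReLU, concatenate with the high level features, classify with fully connected layers) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the grid size, channel counts, hidden layer width, random weights, and all function names below are assumptions chosen for illustration, and the high level feature vector is a random stand-in for features that the paper takes from VGG-net.

```python
import math
import random


def conv1x1_relu(fmap, weights, bias):
    """1x1 convolution followed by ReLU.

    fmap is an H x W x C_in nested list, weights is C_out x C_in, bias has
    length C_out. A 1x1 convolution mixes channels at each pixel independently.
    """
    return [
        [
            [max(0.0, sum(w * x for w, x in zip(row_w, pixel)) + b)
             for row_w, b in zip(weights, bias)]
            for pixel in row
        ]
        for row in fmap
    ]


def flatten(fmap):
    """Flatten an H x W x C feature map into a single vector."""
    return [v for row in fmap for pixel in row for v in pixel]


def dense(vec, weights, bias):
    """Fully connected layer: weights is N_out x N_in."""
    return [sum(w * x for w, x in zip(row_w, vec)) + b
            for row_w, b in zip(weights, bias)]


def softmax(logits):
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def random_layer(rng, n_out, n_in, scale=0.1):
    """Hypothetical untrained layer: small random weights, zero bias."""
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def saliency_score(distance_map, high_level, params):
    """Encode the distance map, fuse with high level features, classify."""
    encoded = distance_map
    for weights, bias in params["encoder"]:
        encoded = conv1x1_relu(encoded, weights, bias)
    fused = flatten(encoded) + list(high_level)  # concatenation step
    hidden = [max(0.0, h) for h in dense(fused, *params["fc1"])]
    probs = softmax(dense(hidden, *params["fc2"]))
    return probs[1]  # index 1 is treated as the "salient" class


# All sizes below are illustrative assumptions, not values from the paper.
rng = random.Random(0)
H, W, C_IN, HIGH_DIM, HIDDEN = 4, 4, 6, 8, 16
params = {
    "encoder": [random_layer(rng, 4, C_IN), random_layer(rng, 2, 4)],
    "fc1": random_layer(rng, HIDDEN, H * W * 2 + HIGH_DIM),
    "fc2": random_layer(rng, 2, HIDDEN),
}
distance_map = [[[rng.random() for _ in range(C_IN)] for _ in range(W)]
                for _ in range(H)]
high_level = [rng.random() for _ in range(HIGH_DIM)]  # stand-in for VGG features
score = saliency_score(distance_map, high_level, params)
```

With untrained random weights the resulting `score` is meaningless as a saliency estimate; the sketch only shows how the two feature streams are shaped and joined before classification.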
