Efficient saliency detection using convolutional neural networks with feature selection

Abstract: Saliency detection is a fundamental problem in computer vision. With the advent of convolutional neural networks (CNNs), computational models for salient object detection have evolved from relying on handcrafted features to exploiting high-level deep contrast information. However, existing approaches to saliency detection rarely analyze CNN features in depth. This study finds that a promising saliency map can be learned from low-, middle-, and high-layer feature maps, but that only some of these feature maps improve the results. Maps that are highly similar to the ground truth contain more contrast information and should be selected. These findings motivate the design of our saliency detection system. When the ground truth is unknown, we construct a deep CNN to obtain an approximate salient area that substitutes for the ground truth during feature map selection. A CNN architecture with three convolutional layers is then built on top of all selected feature maps to learn their deep contrast information, directly yielding improved saliency maps. Experimental results demonstrate that the proposed method significantly outperforms competing approaches on three widely used datasets.
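The selection step described in the abstract can be illustrated with a minimal sketch. The abstract does not state which similarity measure or selection rule is used, so the cosine similarity, the min-max normalization, the `top_k` cutoff, and the function name `select_feature_maps` are all assumptions made for illustration, not the authors' method.

```python
import numpy as np


def select_feature_maps(feature_maps, reference_mask, top_k):
    """Rank feature maps by similarity to a reference mask; keep the top_k.

    feature_maps:   array of shape (C, H, W), one map per channel.
    reference_mask: array of shape (H, W), the ground truth or its
                    approximate substitute, with values in [0, 1].
    Returns (indices of the selected maps, similarity score of every map).
    """
    c = feature_maps.shape[0]
    flat = feature_maps.reshape(c, -1).astype(np.float64)

    # Min-max normalize each map to [0, 1] so scores are comparable.
    mins = flat.min(axis=1, keepdims=True)
    ranges = flat.max(axis=1, keepdims=True) - mins
    ranges[ranges == 0] = 1.0  # constant maps become all zeros
    flat = (flat - mins) / ranges

    ref = reference_mask.reshape(-1).astype(np.float64)

    # Assumed similarity measure: cosine similarity with the reference.
    num = flat @ ref
    denom = np.linalg.norm(flat, axis=1) * np.linalg.norm(ref)
    scores = np.divide(num, denom, out=np.zeros_like(num), where=denom > 0)

    # Highest-scoring maps first; stable sort keeps ties in channel order.
    order = np.argsort(-scores, kind="stable")[:top_k]
    return order, scores
```

In a pipeline like the one the abstract outlines, the selected maps would then be stacked and passed to the three-layer convolutional network; that network is not sketched here because the abstract gives no layer sizes.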
