Image salient object detection with refined deep features via convolution neural network

Abstract. Recent advances in saliency detection have used deep learning to obtain high-level features for detecting salient regions. These approaches have demonstrated superior results over previous works that use handcrafted low-level features. We propose a convolutional neural network (CNN) model to learn high-level features for saliency detection. Compared to other methods, our method has two merits. First, during feature extraction, in addition to the convolution and pooling steps, we add a restricted Boltzmann machine to the CNN framework to obtain more accurate features in the intermediate step. Second, to avoid the need for manually annotated data, we add a deep belief network classifier at the end of the model to classify salient and nonsalient regions. Quantitative and qualitative experiments on three benchmark datasets demonstrate that our method performs favorably against state-of-the-art methods.
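As a rough illustration of the feature-extraction path described in the abstract, the sketch below chains one convolution, one pooling step, and the hidden-unit activation of a restricted Boltzmann machine. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the single 2x2 filter, the layer sizes, and the exact point at which the restricted Boltzmann machine is attached are all hypothetical. Training (for example, by contrastive divergence) and the deep belief network classifier are omitted.

```python
import math


def conv2d_valid(image, kernel):
    """Valid 2D cross-correlation of a 2D list with a 2D kernel."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def max_pool2x2(fmap):
    """Non-overlapping 2x2 max pooling (an odd trailing row or column is dropped)."""
    return [
        [max(fmap[i][j], fmap[i][j + 1], fmap[i + 1][j], fmap[i + 1][j + 1])
         for j in range(0, len(fmap[0]) - 1, 2)]
        for i in range(0, len(fmap) - 1, 2)
    ]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def rbm_hidden_probs(visible, weights, hidden_bias):
    """Standard RBM conditional: P(h_j = 1 | v) = sigmoid(c_j + sum_i v_i * W[i][j])."""
    return [
        sigmoid(hidden_bias[j]
                + sum(v * weights[i][j] for i, v in enumerate(visible)))
        for j in range(len(hidden_bias))
    ]


def refine_features(image, kernel, weights, hidden_bias):
    """Assumed pipeline: convolution -> pooling -> flatten -> RBM hidden layer."""
    pooled = max_pool2x2(conv2d_valid(image, kernel))
    visible = [x for row in pooled for x in row]  # flatten the pooled map
    return rbm_hidden_probs(visible, weights, hidden_bias)
```

In this sketch the hidden-unit probabilities play the role of the refined intermediate features; in a full model they would be passed to later layers and finally to the classifier.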
