Region-Based Saliency Detection and Its Application in Object Recognition

The objective of this paper is twofold. First, we introduce an effective region-based solution for saliency detection. Second, we apply the resulting saliency map to better encode image features for the object recognition task. To find perceptually and semantically meaningful salient regions, we extract superpixels with an adaptive mean shift algorithm and use them as the basic elements for saliency detection. The saliency of each superpixel is measured by its spatial compactness, which is calculated from the results of Gaussian mixture model (GMM) clustering. To propagate saliency between similar clusters, we adopt a modified PageRank algorithm to refine the saliency map. Our method not only improves saliency detection, by detecting large salient regions and tolerating noise in cluttered backgrounds, but also generates saliency maps with well-defined object shapes. Experimental results demonstrate the effectiveness of our method. Because objects usually correspond to salient regions, and these regions usually play a more important role in object recognition than the background, we incorporate the saliency map into the sparse coding-based spatial pyramid matching (ScSPM) image representation. To learn a more discriminative codebook and better encode the features of patches that belong to objects, we propose a weighted sparse coding scheme for feature coding. We also propose saliency-weighted max pooling to further emphasize salient regions in the feature pooling module. Experimental results on several datasets show that our weighted ScSPM framework greatly outperforms the ScSPM framework and achieves excellent performance for object recognition.
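As a rough illustration of the saliency-weighted max pooling idea, the sketch below pools sparse codes over one spatial cell after scaling each patch's coefficients by that patch's saliency. The abstract does not give the exact formulation, so the rule used here (the maximum over patches of saliency times absolute coefficient), the function name `saliency_weighted_max_pool`, the data layout, and the assumption that saliency weights lie in [0, 1] are all illustrative choices, not the paper's definition.

```python
from typing import List, Sequence


def saliency_weighted_max_pool(
    codes: Sequence[Sequence[float]],
    saliency: Sequence[float],
) -> List[float]:
    """Pool sparse codes over one spatial cell, weighting patches by saliency.

    codes[i][j] : coefficient of codebook atom j for patch i (assumed layout)
    saliency[i] : saliency weight of patch i, assumed to lie in [0, 1]
    Returns one value per atom: max over patches of saliency[i] * |codes[i][j]|.
    """
    if len(codes) != len(saliency):
        raise ValueError("need exactly one saliency weight per patch")
    if not codes:
        return []
    num_atoms = len(codes[0])
    pooled = [0.0] * num_atoms
    for row, weight in zip(codes, saliency):
        if len(row) != num_atoms:
            raise ValueError("all patches must have the same number of atoms")
        for j, value in enumerate(row):
            # Down-weight responses from low-saliency (background) patches.
            pooled[j] = max(pooled[j], weight * abs(value))
    return pooled


# Two patches, three atoms. The background patch has the larger raw response
# on atom 0, but the salient object patch wins once the weights are applied.
codes = [
    [0.9, 0.0, 0.1],  # background patch, saliency 0.2
    [0.5, 0.3, 0.0],  # object patch, saliency 1.0
]
pooled = saliency_weighted_max_pool(codes, [0.2, 1.0])
```

With plain max pooling, atom 0 would be dominated by the background patch (0.9). Under this weighting, the object patch's response (0.5) is the one that survives, which is the kind of emphasis on salient regions the abstract describes.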
