Saliency Detection by Multiple-Instance Learning

Saliency detection has attracted considerable attention in recent years, both for its theoretical value in explaining human attention and for its applications in segmentation, recognition, and related tasks. However, traditional algorithms are mostly based on unsupervised techniques, which have limited learning ability, and the saliency maps they produce are inconsistent with many properties of human behavior. To overcome these two limitations, this paper presents a framework based on multiple-instance learning. Low-, mid-, and high-level features are incorporated into the detection procedure, and the learning ability makes the framework robust to noise. Experiments on a data set of 1000 images demonstrate the effectiveness of the proposed framework, and its applicability is shown in a seam carving application.
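The abstract does not state which multiple-instance learning formulation the framework uses. Purely as an illustration of the general idea, the sketch below assumes that an image is treated as a bag of instances (for example, regions), that each instance is scored by a logistic model, and that the bag-level probability follows the common noisy-OR rule. The function names, the logistic scorer, and the noisy-OR choice are all assumptions made for this example, not details taken from the paper.

```python
import numpy as np

def instance_saliency(features, weights, bias):
    """Hypothetical logistic scorer (an assumption, not the paper's model).

    features: array of shape (n_instances, n_features), one row per instance.
    Returns one probability in (0, 1) per instance.
    """
    z = np.asarray(features, dtype=float) @ np.asarray(weights, dtype=float) + bias
    return 1.0 / (1.0 + np.exp(-z))

def bag_probability(instance_probs):
    """Noisy-OR bag probability: the bag is positive if at least one
    of its instances is positive."""
    p = np.asarray(instance_probs, dtype=float)
    return 1.0 - np.prod(1.0 - p)

# Toy bag with three instances and two features each (made-up numbers).
toy_features = np.array([[0.0, 0.0], [1.0, -1.0], [2.0, 2.0]])
toy_probs = instance_saliency(toy_features, weights=[1.0, 1.0], bias=0.0)
toy_bag = bag_probability(toy_probs)
```

Under this assumed formulation, only bag-level labels are needed for training, while the per-instance probabilities can be read out afterwards as a saliency score for each region.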
