Retargeting Semantically-Rich Photos

Semantically-rich photos contain a wide variety of semantic objects (e.g., pedestrians and bicycles). Retargeting these photos is challenging because each semantic object has fixed geometric characteristics, so shrinking all of the objects simultaneously during retargeting is prone to distortion. In this paper, we propose to retarget semantically-rich photos by detecting photo semantics from image tags, which are predicted by a multi-label SVM. The key technique is a generative model termed latent stability discovery (LSD), which robustly localizes the various semantic objects in a photo by making use of the predicted, noisy image tags. Based on LSD, a feature fusion algorithm is proposed to detect salient regions at both the low level and the high level. These salient regions are linked sequentially into a path to simulate human visual perception. Finally, we learn the prior distribution of such paths from aesthetically pleasing training photos. The prior enforces the path of a retargeted photo to be maximally similar to those of the training photos. In the experiments, we collect 217 photos of 1600 × 1200 resolution, each containing over seven salient objects. Comprehensive user studies demonstrate the competitiveness of our method.
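
To make the path-construction step concrete, the following is a minimal sketch of linking salient regions into a sequential path. The abstract does not state the linking rule, so the sketch assumes, purely for illustration, that regions are visited in order of decreasing saliency score; the function name `link_regions_into_path`, the tuple layout, and the parameter `k` are all hypothetical and are not taken from the paper.

```python
from typing import List, Tuple

# Hypothetical region record: (label, saliency score, (x, y) center).
Region = Tuple[str, float, Tuple[float, float]]


def link_regions_into_path(regions: List[Region], k: int) -> List[str]:
    """Return the labels of the k most salient regions as an ordered path.

    Assumption (not from the paper): the path visits regions in order of
    decreasing saliency, as a rough stand-in for the order in which a
    viewer might attend to them.
    """
    # Sort by the saliency score (second field), highest first, keep top k.
    top = sorted(regions, key=lambda r: r[1], reverse=True)[:k]
    return [label for label, _, _ in top]


regions = [
    ("bicycle", 0.4, (10.0, 20.0)),
    ("pedestrian", 0.9, (50.0, 60.0)),
    ("sky", 0.1, (0.0, 0.0)),
]
path = link_regions_into_path(regions, k=2)
```

Under this assumed ordering rule, a path built for a retargeted photo could then be compared against paths from training photos, which is the role the learned prior plays in the method described above.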
