Perceptually Aware Image Retargeting for Mobile Devices

Retargeting adapts an original high-resolution photograph or video to a low-resolution screen with an arbitrary aspect ratio. Conventional approaches generally target desktop PCs, because their computational cost can be prohibitive on mobile platforms, especially when retargeting videos. They also typically exploit only low-level visual features and do not encode human visual perception well. In this paper, we propose a novel retargeting framework that rapidly shrinks a photograph/video by leveraging human gaze behavior. Specifically, we first derive a geometry-preserving graph ranking algorithm, which efficiently selects a few salient object patches to mimic the human gaze shifting path (GSP) when viewing a scene. Afterward, an aggregation-based CNN is developed to hierarchically learn the deep representation of each GSP. Based on this representation, a probabilistic model learns the priors of training photographs that professional photographers have marked as aesthetically pleasing. We use the learned priors to efficiently shrink the GSP of the photograph/video being retargeted so as to maximize its similarity to the GSPs of the training photographs. Extensive experiments demonstrate that: 1) our method requires less than 35 ms to retarget a $1024\times 768$ photograph (or a $1280\times 720$ video frame) on popular iOS/Android devices, which is orders of magnitude faster than conventional retargeting algorithms; 2) in a paired-comparison user study, the photographs/videos retargeted by our method significantly outperform those of its competitors; and 3) the learned GSPs are highly indicative of human visual attention according to human eye-tracking experiments.
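The abstract does not spell out the geometry-preserving graph ranking algorithm, so the following is only a rough sketch of the general idea of ranking patches on a graph and keeping the top few as a gaze shifting path. It swaps in generic manifold ranking (a closed-form solve on a normalized affinity graph), not the authors' method. The function names `rank_patches` and `gaze_shifting_path`, the Gaussian affinity that mixes appearance and patch position, and the parameters `alpha`, `sigma_f`, and `sigma_x` are all assumptions made for illustration.

```python
import numpy as np


def rank_patches(features, centers, query_idx, alpha=0.9, sigma_f=1.0, sigma_x=0.25):
    """Score patches by generic manifold ranking (a stand-in, not the paper's algorithm).

    features: (n, d) appearance descriptors, one row per patch (hypothetical input).
    centers:  (n, 2) patch centers in normalized image coordinates (hypothetical input).
    query_idx: index of the patch used as the ranking query.
    """
    features = np.asarray(features, dtype=float)
    centers = np.asarray(centers, dtype=float)
    n = features.shape[0]

    # Pairwise distances in appearance space and in image space.
    df = np.linalg.norm(features[:, None, :] - features[None, :, :], axis=2)
    dx = np.linalg.norm(centers[:, None, :] - centers[None, :, :], axis=2)

    # Assumed affinity: patches are strongly linked when they look alike
    # and lie close together, so spatial layout influences the ranking.
    W = np.exp(-df**2 / (2 * sigma_f**2)) * np.exp(-dx**2 / (2 * sigma_x**2))
    np.fill_diagonal(W, 0.0)

    # Symmetric normalization S = D^{-1/2} W D^{-1/2}.
    degree = W.sum(axis=1)
    inv_sqrt = 1.0 / np.sqrt(degree + 1e-12)
    S = W * inv_sqrt[:, None] * inv_sqrt[None, :]

    # Closed-form ranking scores: solve (I - alpha * S) f = y.
    y = np.zeros(n)
    y[query_idx] = 1.0
    return np.linalg.solve(np.eye(n) - alpha * S, y)


def gaze_shifting_path(scores, k):
    """Return the indices of the k highest-scoring patches, best first."""
    return np.argsort(-np.asarray(scores))[:k]
```

Calling `rank_patches` on a handful of patch descriptors and then `gaze_shifting_path(scores, k)` yields an ordered list of patch indices that could play the role of a GSP in this simplified setting. A real system would need a principled choice of query patch and affinity, which the abstract does not describe.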
