Video retargeting with multi-scale trajectory optimization

Mobile devices are increasingly capable of storing and rendering media, and demand for good video browsing on them is growing. One limitation, however, is the size and aspect ratio of the display. To show a video on a small screen, the rendering process typically has to retarget it so that it fits the target display while preserving as much of the original video information as possible. In this paper, we formulate video retargeting as the problem of finding an optimal trajectory for a cropping window through the video: the window captures the most salient region, which is then scaled to fit the target display. To measure the visual importance of every pixel, we use local spatial-temporal saliency (ST-saliency) and face detection results. The spatiotemporal movement of the cropping window is modeled as a graph, in which a smoothed trajectory is obtained by global optimization with a Max-Flow/Min-Cut method. Using horizontal/vertical projections and a graph-based method, the trajectory of each shot can be estimated within one second. In addition, trajectories are merged to capture more saliency in the video. Experimental results on diverse video content show that our approach is efficient, and a subjective evaluation shows that the retargeted videos achieve desirable user satisfaction.
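To make the trajectory formulation concrete, the following is a minimal sketch, not the method of the paper. It assumes each frame has already been reduced to a horizontal projection (one saliency sum per pixel column), and it substitutes a simple dynamic program for the Max-Flow/Min-Cut solver described above. The function name `crop_trajectory`, the `smooth_weight` parameter, and the linear motion penalty are all illustrative assumptions.

```python
def crop_trajectory(col_saliency, win_width, smooth_weight=1.0):
    """Pick one left edge per frame for a fixed-width cropping window.

    col_saliency: list of frames, each a list of per-column saliency sums
                  (an assumed horizontal projection of the saliency map).
    Objective (an assumption, not the paper's energy): maximize captured
    saliency minus smooth_weight * |x_t - x_(t-1)| summed over frames.
    Solved here by dynamic programming, not by Max-Flow/Min-Cut.
    """
    n_cols = len(col_saliency[0])
    n_pos = n_cols - win_width + 1  # number of valid left-edge positions

    # Saliency captured by the window at each position, via prefix sums.
    gain = []
    for row in col_saliency:
        prefix = [0.0]
        for value in row:
            prefix.append(prefix[-1] + value)
        gain.append([prefix[x + win_width] - prefix[x] for x in range(n_pos)])

    best = list(gain[0])  # best[x]: best score of any path ending at x
    back = []             # back[t-1][x]: predecessor of x at frame t
    for t in range(1, len(col_saliency)):
        new_best, choices = [], []
        for x in range(n_pos):
            prev = max(range(n_pos),
                       key=lambda p: best[p] - smooth_weight * abs(x - p))
            new_best.append(best[prev] - smooth_weight * abs(x - prev)
                            + gain[t][x])
            choices.append(prev)
        best = new_best
        back.append(choices)

    # Trace the optimal path backwards from the best final position.
    x = max(range(n_pos), key=lambda p: best[p])
    path = [x]
    for choices in reversed(back):
        x = choices[x]
        path.append(x)
    path.reverse()
    return path


# Toy input: a steady salient region in columns 1-2, plus a one-frame
# distractor in columns 4-5 of the third frame.
frames = [
    [0, 5, 5, 0, 0, 0],
    [0, 5, 5, 0, 0, 0],
    [0, 4, 4, 0, 5, 5],
    [0, 5, 5, 0, 0, 0],
]
smooth_path = crop_trajectory(frames, win_width=2, smooth_weight=2.0)
greedy_path = crop_trajectory(frames, win_width=2, smooth_weight=0.0)
```

With the motion penalty enabled, the window stays on the steady region and ignores the one-frame distractor; with the penalty set to zero, it jumps to the distractor and back. That trade-off between captured saliency and smooth window motion is what a global optimization over the whole shot is meant to resolve.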
