Compact Video Synopsis via Global Spatiotemporal Optimization

Video synopsis aims to provide condensed representations of the video data that digital cameras now make easy to capture, especially daily surveillance video. Previous work on video synopsis usually shifts active objects only along the time axis, which inevitably causes collisions among moving objects when the video is heavily compressed. In this paper, we propose a novel approach to compact video synopsis using a unified spatiotemporal optimization. Our approach globally shifts moving objects in both the spatial and temporal domains: objects are shifted temporally to reduce the length of the video, and colliding objects are shifted spatially to avoid visible collision artifacts. Furthermore, using a multilevel patch relocation (MPR) method, the moving space of the original video is expanded into a compact background, based on environmental content, to fit the shifted objects. The shifted objects are finally composited with the expanded moving space to obtain a high-quality video synopsis that is more condensed while remaining free of collision artifacts. Our experimental results show that the resulting compact video synopsis can be browsed quickly, preserves relative spatiotemporal relationships, and avoids motion collisions.
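
The trade-off between temporal and spatial shifting can be illustrated with a toy sketch. It is not the paper's formulation: the abstract does not give the energy terms or the solver, so the `Tube` type, the weights `w_collision` and `w_space`, the candidate `offsets`, and the greedy search (standing in for the global optimization) are all assumptions made for illustration. Each object is reduced to a static bounding box that is active for a number of frames.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Tube:
    """Hypothetical stand-in for a moving object: a static box active for `length` frames."""
    length: int
    x: int
    y: int
    w: int
    h: int


def box_overlap(a, pa, b, pb):
    """Overlap area of two boxes after applying spatial offsets (dx, dy)."""
    ax, ay = a.x + pa[1], a.y + pa[2]
    bx, by = b.x + pb[1], b.y + pb[2]
    ow = min(ax + a.w, bx + b.w) - max(ax, bx)
    oh = min(ay + a.h, by + b.h) - max(ay, by)
    return max(ow, 0) * max(oh, 0)


def frames_shared(a, pa, b, pb):
    """Number of synopsis frames in which both tubes are visible."""
    return max(0, min(pa[0] + a.length, pb[0] + b.length) - max(pa[0], pb[0]))


def energy(tubes, placements, w_collision=1.0, w_space=0.5):
    """Assumed toy energy: synopsis length + collision cost + spatial displacement cost."""
    length = max(p[0] + t.length for t, p in zip(tubes, placements))
    collision = sum(
        frames_shared(tubes[i], placements[i], tubes[j], placements[j])
        * box_overlap(tubes[i], placements[i], tubes[j], placements[j])
        for i in range(len(tubes))
        for j in range(i + 1, len(tubes))
    )
    displacement = sum(abs(dx) + abs(dy) for _, dx, dy in placements)
    return length + w_collision * collision + w_space * displacement


def greedy_synopsis(tubes, max_len, offsets=(-20, 0, 20)):
    """Place tubes one at a time, trying every start frame and spatial offset.

    This greedy pass is a simplification; it is not a global optimizer.
    Each placement is a tuple (start_frame, dx, dy).
    """
    placements = []
    for i, tube in enumerate(tubes):
        best = None
        for t in range(max_len - tube.length + 1):
            for dx, dy in product(offsets, offsets):
                candidate = placements + [(t, dx, dy)]
                e = energy(tubes[: i + 1], candidate)
                if best is None or e < best[0]:
                    best = (e, (t, dx, dy))
        placements.append(best[1])
    return placements


# Two objects that occupy the same image region.
tubes = [Tube(length=5, x=0, y=0, w=10, h=10), Tube(length=5, x=0, y=0, w=10, h=10)]
tight = greedy_synopsis(tubes, max_len=5)    # no room to separate them in time
loose = greedy_synopsis(tubes, max_len=10)   # room to play them back to back
```

Under these assumed weights, the 5-frame synopsis forces both objects to play simultaneously, so the second one is displaced spatially to avoid the overlap; with 10 frames available, the search instead plays them one after the other with no spatial shift.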
