VideoPuzzle: Descriptive One-Shot Video Composition

A large number of short, single-shot videos are created with personal camcorders every day, such as the small video clips in family albums, so a solution for presenting and managing these clips is highly desirable. From the perspective of professionalism and artistry, long-take video, also termed long-shot or one-shot video, can present events, persons, or scenic spots in an informative manner. This paper presents a novel video composition system, “Video Puzzle”, which generates aesthetically enhanced long-shot videos from short video clips. Our task is to automatically composite several related single shots into a virtual long-take video with spatial and temporal consistency. We propose a novel framework that composes descriptive long-take videos from content-consistent shots retrieved from a video pool. For each video, a frame-by-frame search is performed over the entire pool to find start-end content correspondences through a coarse-to-fine partial matching process. The content correspondence here is general and can refer to matched regions or objects, such as human bodies and faces. The content consistency of these correspondences enables us to design several shot transition schemes that seamlessly stitch one shot to another in a spatially and temporally consistent manner. The entire long-take video thus comprises several single shots with consistent content and fluent transitions. In addition, using the generated matching graph of videos, the proposed system can also provide an efficient video browsing mode. Experiments conducted on multiple video albums demonstrate the effectiveness and usefulness of the proposed scheme.
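
The idea of linking clips through start-end correspondences and then chaining them along a matching graph can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes each clip is reduced to one hypothetical feature vector for its opening frames and one for its closing frames, uses plain cosine similarity with an arbitrary threshold in place of the coarse-to-fine partial matching, and picks the longest chain by exhaustive depth-first search. The names `build_matching_graph`, `longest_chain`, and the sample clips are illustrative only.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_matching_graph(clips, threshold=0.9):
    """Add an edge a -> b when the end of clip a resembles the start of clip b.

    clips maps a clip name to (start_descriptor, end_descriptor).
    The descriptors and the threshold are assumptions made for this sketch.
    """
    graph = {name: [] for name in clips}
    for a, (_, end_a) in clips.items():
        for b, (start_b, _) in clips.items():
            if a != b and cosine(end_a, start_b) >= threshold:
                graph[a].append(b)
    return graph


def longest_chain(graph):
    """Return the longest simple path in the graph (exhaustive DFS).

    Exponential in the worst case, so only suitable for small clip pools.
    """
    best = []

    def dfs(node, path):
        nonlocal best
        if len(path) > len(best):
            best = list(path)
        for nxt in graph[node]:
            if nxt not in path:  # use each clip at most once
                path.append(nxt)
                dfs(nxt, path)
                path.pop()

    for start in graph:
        dfs(start, [start])
    return best


# Toy pool: each clip has a made-up 3-D start and end descriptor.
clips = {
    "beach": ([1, 0, 0], [0, 1, 0]),
    "picnic": ([0, 1, 0], [0, 0, 1]),
    "sunset": ([0, 0, 1], [1, 1, 0]),
}
graph = build_matching_graph(clips)
chain = longest_chain(graph)
```

In this toy pool the end of "beach" matches the start of "picnic", and the end of "picnic" matches the start of "sunset", so the chain visits the three clips in that order. The same graph could also back a browsing mode, since each node's outgoing edges list the clips that can follow it.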
