Video^M: Multi-video Synopsis

Conventional video representation methods focus predominantly on a single video and aim to reduce its space-time redundancy as much as possible. In contrast, this paper describes a novel approach that presents the dynamics of multiple videos simultaneously, aiming at a less intrusive viewing experience. Given a main video and multiple supplementary videos, the proposed approach automatically constructs a synthesized multi-video synopsis, called Video^M, by integrating the supplementary videos into the most suitable space-time portions of the main video. We formulate Video^M as a maximum a posteriori (MAP) problem that maximizes the desired properties related to a less intrusive viewing experience, i.e., informativeness, consistency, visual naturalness, and stability. This problem is solved with the Viterbi beam search algorithm, which finds the most suitable integration of the supplementary videos into the main video.
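To illustrate the kind of search the abstract refers to, the following is a minimal sketch of a generic Viterbi beam search over a small discrete state space. It is not the paper's implementation: the function name `viterbi_beam_search`, the toy `suitability` table, and the `emission` and `transition` scores are all assumptions made for illustration. Here each step might stand for one supplementary video, each state for a candidate insertion region in the main video, and the transition penalty for a stability-like preference against switching regions.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple


def viterbi_beam_search(
    num_steps: int,
    states: Sequence[int],
    emission: Callable[[int, int], float],
    transition: Callable[[int, int], float],
    beam_width: int,
) -> Tuple[List[int], float]:
    """Return a high-scoring state sequence and its log score.

    emission(t, s) and transition(prev, s) are log scores. Only the
    beam_width best partial paths survive each step, so the result is
    guaranteed optimal only when beam_width >= len(states).
    """
    # Initial beam: one partial path per state, pruned to the beam width.
    beam = sorted(
        ((emission(0, s), [s]) for s in states),
        key=lambda entry: entry[0],
        reverse=True,
    )[:beam_width]

    for t in range(1, num_steps):
        # Viterbi recombination: keep the best path ending in each state.
        best: Dict[int, Tuple[float, List[int]]] = {}
        for score, path in beam:
            for s in states:
                cand = score + transition(path[-1], s) + emission(t, s)
                if s not in best or cand > best[s][0]:
                    best[s] = (cand, path + [s])
        # Beam pruning: discard all but the top-scoring hypotheses.
        beam = sorted(best.values(), key=lambda entry: entry[0], reverse=True)[:beam_width]

    score, path = beam[0]
    return path, score


# Hypothetical toy data: suitability[t][s] is how well step t fits region s.
suitability = [
    [0.1, 0.7, 0.2],
    [0.2, 0.6, 0.2],
    [0.7, 0.2, 0.1],
]


def emission(t: int, s: int) -> float:
    return math.log(suitability[t][s])


def transition(prev: int, s: int) -> float:
    # Assumed stability term: switching regions halves the score.
    return 0.0 if prev == s else math.log(0.5)


path, score = viterbi_beam_search(3, [0, 1, 2], emission, transition, beam_width=2)
print(path, math.exp(score))
```

The beam width trades accuracy for speed: a width equal to the number of states reduces to the exact Viterbi algorithm, while smaller widths prune hypotheses early and may miss the true maximum.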
