Interactive and real-time generation of home video summaries on mobile devices

With the proliferation of mobile devices and multimedia, videos have become an indispensable part of life-logs of personal experiences. In this paper, we present a real-time, interactive application for home video summarization on mobile devices. The main challenge of real-time operation is the lack of information about the video content in subsequent frames, a condition we term "partial-context" in this paper. First, a real-time segmentation algorithm based on partial-context decomposes the captured video into segments according to changes in the dominant camera motion. Second, the main challenge for conventional video summarization is the semantic understanding of the video content. We therefore leverage the fact that user input is easy to obtain on a mobile device and address this problem through user interaction. The user's preference is learned and modeled by a Gaussian Mixture Model (GMM), which is updated each time the user manually selects key frames. Evaluation results demonstrate that our system significantly improves the user experience and provides an efficient automatic/semi-automatic video summarization solution for mobile users.
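To make the partial-context constraint concrete, the following is a minimal sketch of causal segmentation, not the algorithm used in this paper. It assumes that each frame already carries a dominant camera motion label (for example "static", "pan", or "zoom"), and it opens a new segment only after a new label has persisted for `min_run` consecutive frames. The function name `segment_stream`, the label strings, and the `min_run` threshold are all hypothetical choices made for illustration.

```python
from typing import Iterable, Iterator, Tuple


def segment_stream(labels: Iterable[str], min_run: int = 3) -> Iterator[Tuple[int, int, str]]:
    """Yield (start, end, label) segments, end exclusive, using only frames seen so far.

    Assumption: `labels` is a per-frame stream of dominant camera motion labels.
    """
    current = None    # label of the segment that is currently open
    start = 0         # index of the first frame in the open segment
    candidate = None  # label that might begin a new segment
    run = 0           # consecutive frames observed with the candidate label
    i = -1
    for i, label in enumerate(labels):
        if current is None:
            current, start = label, i
            continue
        if label == current:
            # The open segment continues; discard any pending candidate.
            candidate, run = None, 0
            continue
        if label == candidate:
            run += 1
        else:
            candidate, run = label, 1
        if run >= min_run:
            # The change has persisted long enough: close the open segment
            # at the frame where the candidate label first appeared.
            boundary = i - run + 1
            yield (start, boundary, current)
            current, start = candidate, boundary
            candidate, run = None, 0
    if current is not None:
        # Capture has ended; emit the segment that is still open.
        yield (start, i + 1, current)


if __name__ == "__main__":
    stream = ["static"] * 4 + ["pan"] + ["static"] * 2 + ["pan"] * 5 + ["zoom"] * 3
    for segment in segment_stream(stream, min_run=3):
        print(segment)
```

Because a boundary is confirmed only `min_run` frames after it occurs, the sketch trades a short detection delay for robustness to isolated motion outliers, which is one plausible way to cope with not seeing future frames.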
