Paper information - Automatic multi-camera remix from single video

Automatic multi-camera remix from single video

In this paper we present SmartView, a first-of-its-kind system that automatically creates a multi-camera video remix from a single video. We present a novel method that fuses multimodal content analysis with cinematic rules to create a multi-camera experience. Further, a playback-metadata-based model, which consists of playback instructions for a metadata-aware media player, provides the remix experience without editing the original video content. This approach has a low footprint, which makes it suitable for on-device processing on resource-constrained mobile devices. The research prototype demonstrates the feasibility of such a system with current off-the-shelf mobile devices. The SmartView creation process took less time than the duration of the video. Five out of nine test users found the fully automatic SmartView remix experience to be better than conventional playback. The user-customized SmartView remix was preferred over conventional playback.
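The playback-metadata idea in the abstract can be illustrated with a minimal sketch. The abstract does not describe the actual SmartView metadata format, so everything below is an assumption made for illustration: the `ViewInstruction` record, the normalized crop rectangle, and the `active_crop` and `to_pixels` helpers are hypothetical names. The sketch only shows how a player could switch between virtual camera views by looking up a crop region for the current playback time, leaving the source video untouched.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class ViewInstruction:
    """Hypothetical playback instruction: show one crop of the frame from start_s onward."""
    start_s: float                             # playback time (seconds) at which the view becomes active
    crop: tuple[float, float, float, float]    # (x, y, width, height), normalized to [0, 1]


FULL_FRAME = (0.0, 0.0, 1.0, 1.0)


def active_crop(instructions, t):
    """Return the crop active at time t; instructions must be sorted by start_s."""
    starts = [ins.start_s for ins in instructions]
    idx = bisect_right(starts, t) - 1          # last instruction starting at or before t
    if idx < 0:
        return FULL_FRAME                      # before the first instruction: show the whole frame
    return instructions[idx].crop


def to_pixels(crop, frame_w, frame_h):
    """Convert a normalized crop to integer pixel coordinates for a given frame size."""
    x, y, w, h = crop
    return (round(x * frame_w), round(y * frame_h),
            round(w * frame_w), round(h * frame_h))


# Assumed example remix: wide shot, then a centered close-up, then the top-right quadrant.
remix = [
    ViewInstruction(0.0, FULL_FRAME),
    ViewInstruction(4.0, (0.25, 0.25, 0.5, 0.5)),
    ViewInstruction(8.0, (0.5, 0.0, 0.5, 0.5)),
]
```

A player following this sketch would call `active_crop(remix, t)` for each rendered frame and scale the resulting region to the display. Because the remix lives entirely in the small instruction list, the original video file is never re-encoded.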

Igor D. D. Curcio | Sujeet Mate | Arto Lehtiniemi | Antti J. Eronen
