Framework for Script Based Virtual Directing and Multimedia Authoring in Live Video Streaming

We propose a novel framework that facilitates automatic editing and authoring of multimedia using static and moving cameras in a dynamic scene. The framework incorporates several video analysis techniques, such as object tracking using mean shift and object recognition using the Scale-Invariant Feature Transform (SIFT). These techniques are linked together by a comprehensive yet simple-to-program script authoring mechanism based on video event detection. Together, these features allow the system to act as a virtual director in live video stream editing and multimedia integration. The system requires minimal human intervention and can improve production efficiency for both novice and professional users. Experimental results from our prototype system demonstrate that the framework can be implemented with inexpensive hardware and standard video cameras. The system also provides comprehensive pre-production authoring capabilities that support the integration of video and heterogeneous multimedia elements in real time. We have found the framework useful in many applications, such as live video streaming, distance education, live entertainment, sports coverage, and personal video broadcasting.
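To make the idea of a script authoring mechanism driven by video event detection more concrete, the following is a minimal illustrative sketch, not the system's actual scripting language. It assumes that a tracker or recognizer emits events, and that a script is a list of rules mapping an event to an editing action. All names here (`Event`, `Rule`, `VirtualDirector`, the event kinds, and the action strings) are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Event:
    """A detected video event (hypothetical structure)."""
    kind: str               # assumed event name, e.g. "object_recognized"
    camera: str             # assumed camera identifier
    data: Dict[str, str]    # extra details, e.g. the recognized label


@dataclass
class Rule:
    """One script entry: if an event matches, produce an editing action."""
    kind: str
    condition: Callable[[Event], bool]
    action: Callable[[Event], str]


class VirtualDirector:
    """Dispatches detected events to the scripted rules, in script order."""

    def __init__(self, rules: List[Rule]) -> None:
        self.rules = list(rules)
        self.log: List[str] = []   # every action issued so far

    def dispatch(self, event: Event) -> List[str]:
        fired = [
            rule.action(event)
            for rule in self.rules
            if rule.kind == event.kind and rule.condition(event)
        ]
        self.log.extend(fired)
        return fired


# A hypothetical two-rule script.
rules = [
    Rule(
        kind="object_recognized",
        condition=lambda e: e.data.get("label") == "whiteboard",
        action=lambda e: f"overlay slides on {e.camera}",
    ),
    Rule(
        kind="target_left_view",
        condition=lambda e: True,
        action=lambda e: f"cut away from {e.camera}",
    ),
]

director = VirtualDirector(rules)
```

In a sketch like this, the tracking and recognition components would only need to construct `Event` objects. The editing decisions stay in the rule list, which is what would keep such a script simple to author before production.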
