A framework for virtual videography

Many events that occur regularly would be worth preserving on video, yet traditional video production methods are impractical for them. In this paper, we describe one way to capture and produce video inexpensively and unobtrusively in a classroom lecture environment. We discuss the importance of cinematic principles in the lecture video domain and describe guidelines that should be followed when capturing a lecture. We then survey the tools from computer vision and computer graphics that allow us to determine syntactic information about images. Finally, we describe how to combine these tools into a framework for a Virtual Videography system, one that can automatically generate production-quality video. The framework is based on the creation of region objects, each a semantically related region of video, even though we can reliably gather only syntactic information.
