SmartCamera: a low-cost and intelligent camera management system

Intelligent camera management systems have been developed to record meetings automatically for videoconferencing. These systems offer benefits such as lower production cost and convenient documentation of events. However, automatically recorded videos are generally not visually engaging. This paper presents a novel approach that intelligently controls camera shots and angles to improve visual interest. We use 3D infrared images captured by a Kinect sensor to recognize active speakers and their positions in a meeting. A movable camera, built by mounting a wireless PTZ (pan-tilt-zoom) camera on a motorized rail, automatically repositions itself to frame the active speaker in the center of the screen. A speaker can switch video sources through gesture-based commands without interrupting the meeting. We have summarized and implemented a set of heuristic rules that simulate a human director. These rules can be edited visually through a graphical user interface, and this customization of the virtual director makes our system applicable to various scenarios. We conducted a user study, and the evaluation results support the quality of the automatically produced video.
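To make the idea of a rule-driven virtual director concrete, the following is a minimal sketch and not the system's actual implementation. It assumes the depth sensor reports each active speaker as a lateral offset and a depth in metres, and it illustrates one plausible heuristic, a minimum shot duration before cutting to a new speaker. The names `Speaker`, `pan_angle_deg`, `VirtualDirector`, and `min_shot_seconds` are hypothetical and do not come from the paper.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass
class Speaker:
    """Hypothetical speaker record derived from depth-sensor data."""
    speaker_id: int
    x: float  # lateral offset from the sensor in metres (right is positive)
    z: float  # depth (distance in front of the sensor) in metres


def pan_angle_deg(speaker: Speaker, camera_x: float = 0.0) -> float:
    """Pan angle that centres the speaker for a camera at lateral offset camera_x."""
    return math.degrees(math.atan2(speaker.x - camera_x, speaker.z))


class VirtualDirector:
    """Assumed example rule: hold a shot for a minimum time before cutting."""

    def __init__(self, min_shot_seconds: float = 3.0) -> None:
        self.min_shot_seconds = min_shot_seconds
        self.current_id: Optional[int] = None
        self.shot_started = 0.0

    def update(self, now: float, active: Optional[Speaker]) -> Optional[int]:
        """Return the id of the speaker to frame at time `now` (seconds)."""
        if active is None:
            # Nobody is speaking: keep the current shot.
            return self.current_id
        first_shot = self.current_id is None
        held_long_enough = now - self.shot_started >= self.min_shot_seconds
        if first_shot or (active.speaker_id != self.current_id and held_long_enough):
            self.current_id = active.speaker_id
            self.shot_started = now
        return self.current_id
```

In this sketch the director ignores a change of speaker until the current shot has lasted `min_shot_seconds`, which avoids rapid cuts. An editable rule set, as described above, could expose such thresholds as parameters.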
